Notions of minimal sufficient causation are incorporated within 
the directed acyclic graph causal framework. Doing so allows for the 
graphical representation of sufficient causes and minimal sufficient 
causes on causal directed acyclic graphs while maintaining all of the 
properties of causal directed acyclic graphs. This in turn provides a 
clear theoretical link between two major conceptualizations of causal- 
ity: one counterfactual-based and the other based on a more mecha- 
nistic understanding of causation. The theory developed can be used 
to draw conclusions about the sign of the conditional covariances 
among variables. 

1. Introduction. Two broad conceptualizations of causality can be dis- 
cerned in the literature, both within philosophy and within statistics and 
epidemiology. The first conceptualization may be characterized as giving an 
account of the effects of certain causes; the approach addresses the question, 
"Given a particular cause or intervention, what are its effects?" In the con- 
temporary philosophical literature, this approach is most closely associated 
with Lewis' work [17, 18] on counterfactuals. In the contemporary statistics 
literature, this first approach is closely associated with the work of Rubin 
[30, 31] on potential outcomes, of Robins [25, 26] on the use of counterfac- 
tual variables in the context of time-varying treatment and of Pearl [21] on 
the graphical representation of various counterfactual relations on directed 
acyclic graphs. This counterfactual approach has been used extensively in 
statistics both in the development of theory and in application. The second 
conceptualization of causality may be characterized as giving an account of 
the causes of particular effects; this approach attempts to address the ques- 
tion, "Given a particular effect, what are the various events which might have 
been its cause?" In the contemporary philosophical literature, this second 
approach is most notably associated with Mackie's work [19] on insufficient 
but necessary components of unnecessary but sufficient conditions (INUS 
conditions) for an effect. In the epidemiologic literature, this approach is 
most closely associated with Rothman's work [29] on sufficient-component 
causes. This work is more closely related to the various mechanisms for a 
particular effect than is the counterfactual approach. Rothman's work on 
sufficient-component causes has, however, seen relatively little development, 
extension or application, though the basic framework is routinely taught 
in introductory epidemiology courses. Perhaps the only major attempt in 
the statistics literature to extend and apply Rothman's theory has been the 
work of Aickin [1] (comments relating Aickin's work to the present work are 
available from the authors upon request). 

In this paper, we incorporate notions of minimal sufficient causes, cor- 
responding to Rothman's sufficient-component causes, within the directed 
acyclic graph causal framework [21]. Doing so essentially unites the mecha- 
nistic and the counterfactual approaches into a single framework. As will be 
seen in Section 5, we can use the framework developed to draw conclusions 
about the sign of the conditional covariances among variables. Without the 
theory developed concerning minimal sufficient causes, such conclusions can- 
not be drawn from causal directed acyclic graphs. In a related paper [35] we 
have discussed how these ideas relate to epidemiologic research. The present 
paper develops the theory upon which this epidemiologic discussion relies. 

The theory developed in this paper is motivated by several other con- 
siderations. As will be seen below, the incorporation of minimal sufficient 
cause nodes allows for the identification of certain conditional independen- 
cies which hold only within a particular stratum of the conditioning vari- 
able (i.e., "asymmetric conditional independencies," [7]) which were not evi- 
dent without the minimal sufficient causation structures. We note that these 
asymmetric conditional independencies have been represented elsewhere by 
Bayesian multinets [7] or by trees [3]. Another motivation for the develop- 
ment of the theory in this paper concerns the notion of interaction. Prod- 
uct terms are frequently included in regression models to assess interactions 
among variables; these statistical interactions, however, even if present, need 
not imply the existence of an actual mechanism in which two distinct causes 
both participate. Interactions which do concern the actual mechanisms are 
sometimes referred to as instances of "synergism" [29], "biologic interac- 
tions" [32] or "conjunctive causes" [20], and the development of minimal 
sufficient cause theory provides a useful framework to characterize mecha- 
nistic interactions. In related work [37] we have derived empirical tests for 
interactions in this sufficient cause sense. 

As yet further motivation, we conclude this Introduction by describing
how the methods we develop in this paper clarified and helped resolve an
analytic puzzle faced by psychiatric epidemiologists. Consider the following
somewhat simplified version of a study reported in Hudson et al. [10].
Three hundred pairs of obese siblings living in an ethnically homogeneous
upper-middle class suburb of Boston are recruited and cross classified by
the presence or absence of two psychiatric disorders: manic-depressive
disorder P and binge eating disorder B. The question of scientific interest is
whether these two disorders have a common genetic cause, because, if so,
studies to search for a gene or genes that cause both disorders would be
useful. Consider two analyses. The first analysis estimates the covariance
$\beta$ between $P_{2i}$ and $B_{1i}$, while the second analysis estimates the
conditional covariance $\alpha$ between $P_{2i}$ and $B_{1i}$ among subjects with
$P_{1i} = 1$, where $B_{ki}$ is 1 if the kth sibling in the ith family has disorder B
and is zero otherwise, with $P_{ki}$ defined analogously. It was found that the
estimates of $\beta$ and $\alpha$ were both positive with 95% confidence intervals
that excluded zero.

Fig. 1. Causal directed acyclic graph under the alternative hypothesis of familial coaggregation.

The substantive prior knowledge of Hudson et al. [10] is summarized in the
directed acyclic graph of Figure 1, in which the i index denoting family is 
suppressed. In what follows, we will make reference to some standard re- 
sults concerning directed acyclic graphs; these results are reviewed in detail 
in the following section. 

In Figure 1, $G_B$ and $G_P$ represent the genetic causes of B and P,
respectively, that are not common causes of both B and P. The variables $E_1$
and $E_2$ represent the environmental exposures of siblings 1 and 2,
respectively, that are common causes of both diseases, for example, exposure
to a particularly stressful school environment. The variables $G_B$ and $G_P$
are assumed independent as would typically be the case if, as is highly
likely, they are not genetically linked. Furthermore, as is common in genetic
epidemiology, the environmental exposures $E_1$ and $E_2$ are assumed
independent of the genetic factors. The causal arrows from $P_1$ to $B_1$ and
$P_2$ to $B_2$ represent the investigators' beliefs that manic-depressive
disorder may be a cause of binge eating disorder but not vice-versa. The node
F represents the common genetic causes of both P and B as well as any
environmental causes of both P and B that are correlated within families.
There is no data available for $G_B$, $G_P$, $E_1$, $E_2$ or F. The reason for
grouping the common genetic causes with the correlated environmental causes
in F is that, based on the available data
$\{P_{ki}, B_{ki};\ i = 1, \ldots, 300,\ k = 1, 2\}$, we can only hope to test
the null hypothesis that F so defined is absent, which is referred to as the
hypothesis of no familial coaggregation. If this null hypothesis is rejected,
we cannot determine from the available data whether F is present due to a
common genetic cause or a correlated common environmental cause. Thus
$E_1$ and $E_2$ are independent on the graph because, by definition, they
represent the environmental common causes of B and P that are independently
distributed between siblings. 

Now, under the null hypothesis that F is absent, we note that $P_2$ and
$B_1$ are still correlated due to the unblocked path
$P_2$ - $G_P$ - $P_1$ - $B_1$, so we would expect $\beta \neq 0$ as found.
Furthermore, $P_2$ and $B_1$ are still expected to be correlated given
$P_1 = 1$ due to the unblocked path $P_2$ - $G_P$ - $P_1$ - $E_1$ - $B_1$, so
we would expect $\alpha \neq 0$ as found. Thus, we cannot test the null
hypothesis that F is absent without further substantive assumptions beyond
those encoded in the causal directed acyclic graph of Figure 1. 

Now Hudson et al. [10] were also willing to assume that for no subset of the
population did the genetic causes $G_P$ and $G_B$ of P and B prevent disease.
Similarly, they assumed there was no subset of the population for whom
the environmental causes $E_1$ and $E_2$ of B and P prevented either disease.
We will show in Section 5 that under these additional assumptions, the null
hypothesis that F is absent implies that the conditional covariance $\alpha$
must be less than or equal to zero, provided that there is no interaction, in
the sufficient cause sense, between E and $G_P$. If it is plausible that no
sufficient cause interaction between E and $G_P$ exists, then the null
hypothesis that F is absent is rejected because the estimate of $\alpha$ is
positive with a 95% confidence interval that does not include zero. 

Thus, the conclusion in the argument above that familial coaggregation 
of diseases B and P was present depended critically on the existence of (i) 
a formal definition of a sufficient cause interaction, (ii) a substantive under- 
standing of what the assumption of no sufficient cause interaction entailed, 
and (iii) a sound mathematical theory that related assumptions about the 
absence of sufficient cause interactions to testable restrictions on the distri- 
bution of the observed data, specifically on the sign of a particular condi- 
tional covariance. In this paper, we provide a theory that offers (i)-(iii). 

The remainder of the paper is organized as follows. The second section 
reviews the directed acyclic graph causal framework and provides some basic 
definitions; the third section presents the theory which allows for the graph- 
ical representation of minimal sufficient causes within the directed acyclic 
graph causal framework; the fourth section gives an additional preliminary 
result concerning monotonicity; the fifth section develops results relating 
minimal sufficient causation and the sign of conditional covariances; the 
sixth section provides some discussion concerning possible extensions to the 
present work. 

2. Basic definitions and concepts. In this section, we review the directed 
acyclic graph causal framework and give a number of definitions regarding 
sufficient conjunctions and related concepts. Following Pearl [21], a causal 
directed acyclic graph is a set of nodes $(X_1, \ldots, X_n)$, corresponding
to variables, and directed edges among nodes, such that the graph has no
cycles and such that, for each node $X_i$ on the graph, the corresponding
variable is given by its nonparametric structural equation
$X_i = f_i(pa_i, \epsilon_i)$, where $pa_i$ are the parents of $X_i$ on the
graph and the $\epsilon_i$ are mutually independent random 
variables. These nonparametric structural equations can be seen as a gener- 
alization of the path analysis and linear structural equation models [21, 22] 
developed by Wright [43] in the genetics literature and Haavelmo [9] in the 
econometrics literature. Robins [27, 28] discusses the close relationship be- 
tween these nonparametric structural equation models and fully randomized, 
causally interpreted structured tree graphs [25, 26]. Spirtes, Glymour and 
Scheines [33] present a causal interpretation of directed acyclic graphs
outside the context of nonparametric structural equations and counterfactual
variables. It is easily seen from the structural equations that
$(X_1, \ldots, X_n)$ admits the following factorization:
$p(X_1, \ldots, X_n) = \prod_{i=1}^{n} p(X_i \mid pa_i)$. The nonparametric
structural equations encode counterfactual relationships among
the variables represented on the graph. The equations themselves represent
one-step ahead counterfactuals with other counterfactuals given by recursive
substitution. The requirement that the $\epsilon_i$ be mutually independent is 
essentially a requirement that there is no variable absent from the graph 
which, if included on the graph, would be a parent of two or more variables 
[21, 22]. 

A path is a sequence of nodes connected by edges regardless of arrowhead 
direction; a directed path is a path which follows the edges in the direction 
indicated by the graph's arrows. A node C is said to be a common cause of 
A and B if there exists a directed path from C to B not through A and a 
directed path from C to A not through B. A collider is a particular node 
on a path such that both the preceding and subsequent nodes on the path 
have directed edges going into that node. A backdoor path from A to B is 
a path that begins with a directed edge going into A. A path between A 
and B is said to be blocked given some set of variables Z if either there is 
a variable in Z on the path that is not a collider or if there is a collider on 
the path such that neither the collider itself nor any of its descendants are 
in Z. If all paths between A and B are blocked given Z, then A and B are 
said to be d-separated given Z. It has been shown that if all paths between 
A and B are blocked given Z, then A and B are conditionally independent 
given Z [8, 13, 40]. 
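
The blocking rules above can be checked mechanically. The following Python sketch enumerates every path between two nodes and applies the definition of a blocked path directly. It is applied to the graph of Figure 1 under the null hypothesis that F is absent. The edge list is an assumption reconstructed from the prose description of Figure 1 in the Introduction, and all function and variable names are illustrative only.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following directed edges."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(edges, start, end):
    """Yield every path from start to end, ignoring arrowhead direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def extend(path):
        if path[-1] == end:
            yield path
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                yield from extend(path + [nxt])

    yield from extend([start])


def is_blocked(edges, path, z):
    """Apply the definition: a path is blocked given z by a non-collider in z,
    or by a collider that is not in z and has no descendant in z."""
    edge_set = set(edges)
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if is_collider:
            if node not in z and not (descendants(edges, node) & z):
                return True
        elif node in z:
            return True
    return False


def d_separated(edges, a, b, z):
    """True when every path between a and b is blocked given the set z."""
    z = set(z)
    return all(is_blocked(edges, p, z) for p in simple_paths(edges, a, b))


# Assumed edges of Figure 1 with the node F removed (the null hypothesis).
NULL_EDGES = [
    ("GP", "P1"), ("GP", "P2"), ("GB", "B1"), ("GB", "B2"),
    ("E1", "P1"), ("E1", "B1"), ("E2", "P2"), ("E2", "B2"),
    ("P1", "B1"), ("P2", "B2"),
]

print(d_separated(NULL_EDGES, "P2", "B1", set()))    # marginal relation
print(d_separated(NULL_EDGES, "P2", "B1", {"P1"}))   # conditioning on P1
```

Under the assumed edge list, both calls report that P2 and B1 are not d-separated, which matches the two unblocked paths described in the Introduction. Path enumeration is exponential in the size of the graph; it is used here only because it mirrors the definition line by line.
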

Suppose that a set of nonparametric structural equations represented by
a directed acyclic graph H is such that its variables X are partitioned into
two sets $X = V \cup W$. If in the nonparametric structural equation for
$V \cup W$, by replacing each occurrence of $X_i \in W$ by
$f_i(pa_i, \epsilon_i)$, the nonparametric structural equations for V can be
written so as to correspond to some causal directed acyclic graph G, then G is
said to be the marginalization of H over the set of variables W. A causal
directed acyclic graph with variables $X = V \cup W$ can be marginalized over
W if no variable in W is a common cause of any two variables in V. 

In giving definitions for a sufficient conjunction and related concepts,
we will use the following notation. An event is a binary variable taking
values in $\{0, 1\}$. The complement of some event E we will denote by
$\overline{E}$. A conjunction or product of the events $X_1, \ldots, X_n$ will
be written as $X_1 \cdots X_n$. The associative OR operator, $\vee$, is
defined by $A \vee B = A + B - AB$. For a random variable A with sample space
$\Omega$ we will use the notation $A \equiv 0$ to denote that
$A(\omega) = 0$ for all $\omega \in \Omega$. We will use the notation
$1_{A=a}$ to denote the indicator function for the random variable A taking
the value a; for some subset S of the sample space $\Omega$, we will use
$1_S$ to denote the indicator that $\omega \in S$. We will use the notation
$A \perp\!\!\!\perp B \mid C$ to denote that A is conditionally independent of
B given C. We begin with the definitions of a sufficient conjunction and a
minimal sufficient conjunction. These basic definitions make no reference to
directed acyclic graphs or causation. 

Definition 1. A set of events $X_1, \ldots, X_n$ is said to constitute a
sufficient conjunction for event D if $X_1 \cdots X_n = 1 \Rightarrow D = 1$.

Definition 2. A set of events $X_1, \ldots, X_n$ which constitutes a sufficient
conjunction for D is said to constitute a minimal sufficient conjunction for
D if no proper subset of $X_1, \ldots, X_n$ constitutes a sufficient
conjunction for D. 

Sufficient conjunctions for a particular event need not be causes for an 
event. Suppose a particular sound is produced when and only when an in- 
dividual blows a whistle. This particular sound the whistle makes is a suf- 
ficient conjunction for the whistle's having been blown, but the sound does 
not cause the blowing of the whistle. The converse, rather, is true; the blow- 
ing of the whistle causes the sound to be produced. Corresponding then to 
these notions of a sufficient conjunction and a minimal sufficient conjunction 
are those of a sufficient cause and a minimal sufficient cause which will be 
defined in Section 3. 

Definition 3. A set of events $M_1, \ldots, M_n$, each of which may be some
product of events, is said to be determinative for some event D if
$D = M_1 \vee M_2 \vee \cdots \vee M_n$.

Fig. 2. Causal directed acyclic graphs with sufficient causation structures.

Definition 4. A determinative set $M_1, \ldots, M_n$ of (minimal) sufficient
conjunctions for D is nonredundant if no proper subset of $M_1, \ldots, M_n$ is
determinative for D. 

Example 1. Suppose $A = B \vee CE$ and $D = EF$. If we consider all the
minimal sufficient conjunctions for A among the events $\{B, C, D\}$, we can
see that $B$ and $CD$ are the only minimal sufficient conjunctions, but it is
not the case that $A = B \vee CD$. Clearly then, a complete list of minimal
sufficient conjunctions for A generated by a particular collection of events
may not be a determinative set of sufficient conjunctions for A. If we
consider all minimal sufficient conjunctions for A among the events
$\{B, C, D, E\}$, we see that $B$, $CD$ and $CE$ are all minimal sufficient
conjunctions. In this example, $B \vee CD \vee CE$ is a determinative set of
minimal sufficient conjunctions for A but is not nonredundant. We see then
that even when a complete list of minimal sufficient conjunctions generated by
a particular collection of events constitutes a determinative set of minimal
sufficient conjunctions, it may not be a nonredundant determinative set of
minimal sufficient conjunctions. 
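
The claims of Example 1 can be confirmed by brute force over truth tables. The Python sketch below treats B, C, E and F as free binary events (no distribution is needed, since Definitions 1 to 4 are purely logical), derives A and D from them, and then enumerates candidate conjunctions. The helper names are hypothetical, and complements of events are deliberately not considered, as in the example.

```python
from itertools import combinations, product

FREE = ("B", "C", "E", "F")  # free binary events of Example 1


def worlds():
    """Every assignment of the free events, with D and A derived from them."""
    for values in product((0, 1), repeat=len(FREE)):
        w = dict(zip(FREE, values))
        w["D"] = w["E"] & w["F"]               # D = EF
        w["A"] = w["B"] | (w["C"] & w["E"])    # A = B v CE
        yield w


def is_sufficient(conj, target):
    """Definition 1: the conjunction equalling 1 forces target = 1."""
    return all(w[target] == 1 for w in worlds() if all(w[x] == 1 for x in conj))


def minimal_sufficient(events, target):
    """Definition 2: sufficient conjunctions with no sufficient proper subset.
    Conjunctions are visited by increasing size, so any sufficient proper
    subset would already contain a previously found minimal one."""
    found = []
    for size in range(1, len(events) + 1):
        for conj in combinations(events, size):
            contains_smaller = any(set(m) <= set(conj) for m in found)
            if not contains_smaller and is_sufficient(conj, target):
                found.append(conj)
    return found


def is_determinative(conjs, target):
    """Definition 3: target equals the OR of the conjunctions in every world."""
    return all(
        w[target] == int(any(all(w[x] for x in c) for c in conjs))
        for w in worlds()
    )


def is_nonredundant(conjs, target):
    """Definition 4: determinative, and no proper subset is determinative."""
    if not is_determinative(conjs, target):
        return False
    return not any(
        is_determinative(list(sub), target)
        for size in range(len(conjs))
        for sub in combinations(conjs, size)
    )


print(minimal_sufficient(("B", "C", "D"), "A"))
print(minimal_sufficient(("B", "C", "D", "E"), "A"))
```

The first call returns the conjunctions B and CD, which fail `is_determinative`; the second returns B, CD and CE, which are determinative but fail `is_nonredundant` because CD can be dropped.
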

3. Minimal sufficient causation and directed acyclic graphs. In this sec- 
tion, we develop theory which allows for the representation of sufficient 
conjunctions and minimal sufficient conjunctions on causal directed acyclic 
graphs. We begin with a motivating example. 

Example 2. Consider a causal directed acyclic graph given in Figure
2(i). Suppose $E_1E_2$ and $E_3E_4$ constitute a determinative set of
sufficient conjunctions for D. We will show in Theorem 1 below that it follows
that the diagram in Figure 2(ii) is also a causal directed acyclic graph where
$E_iE_j$ is simply the product or conjunction of $E_i$ and $E_j$; because the
sufficient conjunctions $E_1E_2$ and $E_3E_4$ are determinative, it follows
that $D = E_1E_2 \vee E_3E_4$. An ellipse is put around the sufficient
conjunctions $E_1E_2$ and $E_3E_4$ to indicate that the set is determinative.
As will be seen below, in order to add sufficient conjunctions it is important
that a determinative set of sufficient conjunctions is known or can be
constructed. Consider the causal directed acyclic graph given in Figure
2(iii). Suppose that no determinative set of sufficient conjunctions can be
constructed from $E_1$ and $E_2$ alone; suppose further, however, that there
exists some other cause of D, say A, independent of $E_1$ and $E_2$, such that
$E_1E_2$ and $AE_2$ form a determinative set of sufficient conjunctions. Then,
Theorem 1 below can again be used to show that Figure 2(iv) is a causal
directed acyclic graph. Furthermore, it will be shown in Theorem 2 that for
any causal directed acyclic graph with a binary node which has only binary
parents, a set of variables $\{A_i\}_{i=0}^{u}$ always exists such that a
determinative set of sufficient causes can be formed from the original parents
on the graph and the variables $\{A_i\}_{i=0}^{u}$. 

Theorem 1 provides the formal result required for the previous example. 

Theorem 1. Consider a causal directed acyclic graph G with some node
D such that D and all its parents are binary. Suppose that there exists a set
of binary variables $A_0, \ldots, A_u$ such that a determinative set of
sufficient conjunctions for D, say $M_1, \ldots, M_S$, can be formed from
conjunctions of $A_0, \ldots, A_u$ along with the parents of D on G and the
complements of these variables. Suppose further that there exists a causal
directed acyclic graph H such that the parents of D on H that are not on G
consist of the nodes $A_0, \ldots, A_u$ and such that G is the marginalization
of H over the set of variables which are on the graph for H but not G. Then,
the directed acyclic graph J formed by adding to H the nodes
$M_1, \ldots, M_S$, removing the directed edges into D from the parents of D
on H, adding directed edges from each $M_i$ into D and adding directed edges
into each $M_i$ from every parent of D on H which appears in the conjunction
for $M_i$ is itself a causal directed acyclic graph. 

Proof. To prove that the directed acyclic graph J is a causal directed
acyclic graph, it is necessary to show that each of the nodes on the directed
acyclic graph can be represented by a nonparametric structural equation
involving only the parents on J of that node and a random term $\epsilon_i$
which is independent of all other random terms $\epsilon_j$ in the
nonparametric structural equations for the other variables on the graph. The
nonparametric structural equation for $M_i$ may be defined as the product of
events in the conjunction for $M_i$. The nonparametric structural equation for
D can be given by

$D = M_1 \vee \cdots \vee M_S$.

The nonparametric structural equations for all other nodes on J can be
taken to be the same as those defining the causal directed acyclic graph
H. Because the nonparametric structural equations for D and for each
$M_i$ on J are deterministic, they have no random-error term. Thus, for the
nonparametric structural equations defining D and each $M_i$ on J, the
requirement that the nonparametric structural equation's random term
$\epsilon_i$ is independent of all the other random terms $\epsilon_j$ in the
nonparametric structural equations for the other variables on the graph is
trivially satisfied. That this requirement is satisfied for the nonparametric
structural equations for the other variables on J follows from the fact that
it is satisfied on H. □ 

In Theorem 1, sufficient conjunctions for D are constructed from some set 
of variables that, on some causal directed acyclic graph H, are all parents 
of D and thus, within the directed acyclic graph causal framework, it makes 
sense to speak of sufficient causes and minimal sufficient causes. 

Definition 5. If, on a causal directed acyclic graph, some node D with
nonparametric structural equation $D = f_D(pa_D, \epsilon_D)$ is such that D
and all its parents are binary, then $X_1, \ldots, X_n$ is said to constitute
a sufficient cause for D if $X_1, \ldots, X_n$ are all parents of D or
complements of the parents of D and are such that
$f_D(pa_D, \epsilon_D) = 1$ for all $\epsilon_D$ whenever $pa_D$ is such that
$X_1 \cdots X_n = 1$; if no proper subset of $X_1, \ldots, X_n$ also
constitutes a sufficient cause for D, then $X_1, \ldots, X_n$ is said to
constitute a minimal sufficient cause for D. A set of (minimal) sufficient
causes, $M_1, \ldots, M_n$, each of which is a product of the parents of D and
their complements, is said to be determinative for some event D if, for all
$\epsilon_D$, $f_D(pa_D, \epsilon_D) = 1$ if and only if $pa_D$ is such that
$M_1 \vee M_2 \vee \cdots \vee M_n = 1$; if no proper subset of
$M_1, \ldots, M_n$ is also determinative for D, then $M_1, \ldots, M_n$ is said
to constitute a nonredundant determinative set of (minimal) sufficient causes
for D. 

If, for some directed acyclic graph G, there exist $A_0, \ldots, A_u$ which
satisfy the conditions of Theorem 1 for some node D on G so that a
determinative set of sufficient causes for D can be constructed from
$A_0, \ldots, A_u$ along with the parents of D on G and their complements,
then D will be said to admit a sufficient causation structure. As in Example
2, we will, in general, replace the $M_i$ nodes with the conjunctions that
constitute them. The node D with directed edges from the $M_i$ nodes is
effectively an OR node. The $M_i$ nodes with the directed edges from the
$A_i$ nodes and the parents of D on G are effectively AND nodes. We call this
resulting diagram a causal directed acyclic graph with a sufficient causation
structure (or a minimal sufficient causation structure if the sufficient
conjunctions in the determinative set for D are each minimal sufficient
conjunctions).

Because a causal directed acyclic graph with a sufficient causation
structure is itself a causal directed acyclic graph, the d-separation
criterion applies and allows one to determine independencies and conditional
independencies. A minimal sufficient causation structure will often make
apparent conditional independencies within a particular stratum of the
conditioning variable which were not apparent on the original causal directed
acyclic graph. The following corollary is useful in this regard. 

Corollary 1. If some node D on a causal directed acyclic graph admits
a sufficient causation structure, then conditioning on $D = 0$ also conditions
on all sufficient cause nodes for D on the causal directed acyclic graph with
the sufficient causation structure. 

Example 2 (Continued). Consider the causal directed acyclic graph
with the minimal sufficient causation structure given in Figure 2(ii).
Conditioning on $D = 0$ also conditions on $E_1E_2 = 0$ and $E_3E_4 = 0$, and
thus, by the d-separation criterion, $E_i$ is conditionally independent of
$E_j$ given $D = 0$ for $i \in \{1, 2\}$, $j \in \{3, 4\}$. In the causal
directed acyclic graph with the minimal sufficient causation structure in
Figure 2(iv), no similar conditional independence relations within the
$D = 0$ stratum hold. Although conditioning on $D = 0$ conditions also on
$E_1E_2 = 0$ and $AE_2 = 0$, there still remains an unblocked path
$E_1$ - $E_1E_2$ - $E_2$ - $AE_2$ - $A$ between $E_1$ and $A$, and so
$E_1$ and $A$ are not conditionally independent given $D = 0$. Similarly,
there are unblocked paths between $E_1$ and $E_2$ given $D = 0$ and also
between $E_2$ and $A$ given $D = 0$. 
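
The independence statements of this example can be checked numerically with exact arithmetic. The sketch below assumes, for illustration only, that the parents of D are mutually independent Bernoulli variables with arbitrarily chosen probabilities (an assumption that goes beyond what the figures state), and that D is given deterministically by the structures described for Figures 2(ii) and 2(iv). Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def distribution_given_d0(names, p, d_fn):
    """Exact joint law of the parent events conditional on D = 0, assuming
    independent Bernoulli(p[name]) parents and D = d_fn(world)."""
    weights = {}
    for values in product((0, 1), repeat=len(names)):
        world = dict(zip(names, values))
        if d_fn(world) == 0:
            weight = Fraction(1)
            for name in names:
                weight *= p[name] if world[name] else 1 - p[name]
            weights[values] = weight
    total = sum(weights.values())
    return {values: w / total for values, w in weights.items()}


def independent_given_d0(names, p, d_fn, x, y):
    """True when x and y are exactly independent within the D = 0 stratum."""
    dist = distribution_given_d0(names, p, d_fn)
    ix, iy = names.index(x), names.index(y)

    def prob(pred):
        return sum(pr for values, pr in dist.items() if pred(values))

    for a, b in product((0, 1), repeat=2):
        joint = prob(lambda v: v[ix] == a and v[iy] == b)
        if joint != prob(lambda v: v[ix] == a) * prob(lambda v: v[iy] == b):
            return False
    return True


# Figure 2(ii): D = E1E2 v E3E4 (probabilities are illustrative assumptions).
NAMES_II = ("E1", "E2", "E3", "E4")
P_II = {"E1": Fraction(1, 3), "E2": Fraction(1, 2),
        "E3": Fraction(1, 4), "E4": Fraction(2, 3)}
d_ii = lambda w: (w["E1"] & w["E2"]) | (w["E3"] & w["E4"])

# Figure 2(iv): D = E1E2 v AE2.
NAMES_IV = ("E1", "E2", "A")
P_IV = {"E1": Fraction(1, 3), "E2": Fraction(1, 2), "A": Fraction(1, 4)}
d_iv = lambda w: (w["E1"] & w["E2"]) | (w["A"] & w["E2"])

print(independent_given_d0(NAMES_II, P_II, d_ii, "E1", "E3"))
print(independent_given_d0(NAMES_IV, P_IV, d_iv, "E1", "A"))
```

Under these assumptions the first check succeeds and the second fails, in agreement with the d-separation argument: within the $D = 0$ stratum the pairs $(E_1, E_2)$ and $(E_3, E_4)$ are constrained separately in the first structure, whereas in the second structure $E_2$ links $E_1$ and $A$.
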

The additional variables $A_0, \ldots, A_u$ needed to form a set of sufficient
causes for D we will refer to as the co-causes of D. The co-causes
$A_0, \ldots, A_u$ required to form a determinative set of sufficient
conjunctions for D will generally not be unique. For example, if
$D = A_0 \vee A_1E$, then it is also the case that $D = B_0 \vee B_1E$, where
$B_0 = A_0$ and $B_1 = \overline{A_0}A_1$. Similarly, there will, in general,
be no unique set of sufficient causes that is determinative for D. For
example, if $E_1$ and $E_2$ constitute a set of sufficient causes for D so
that $D = E_1 \vee E_2$, then $E_1E_2$, $E_1\overline{E_2}$ and
$\overline{E_1}E_2$ also constitute a set of sufficient causes for D, and so
we could also write
$D = E_1E_2 \vee E_1\overline{E_2} \vee \overline{E_1}E_2$. It can be shown
that not even nonredundant determinative sets of minimal sufficient causes are
unique. 
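
Both non-uniqueness claims in the preceding paragraph are Boolean identities and can be verified exhaustively. The short Python sketch below does so, reading the second co-cause as the conjunction of the complement of $A_0$ with $A_1$ and the three alternative sufficient causes as $E_1E_2$, $E_1\overline{E_2}$ and $\overline{E_1}E_2$; the helper names are hypothetical.

```python
from itertools import product


def same_event(f, g, n_args):
    """True when two 0/1-valued functions agree on every 0/1 input."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n_args))


def OR(*xs):
    return int(any(xs))


def NOT(x):
    return 1 - x


# D = A0 v A1*E versus D = B0 v B1*E with B0 = A0 and B1 = (not A0)*A1.
co_causes_agree = same_event(
    lambda a0, a1, e: OR(a0, a1 * e),
    lambda a0, a1, e: OR(a0, (NOT(a0) * a1) * e),
    3,
)

# D = E1 v E2 versus D = E1*E2 v E1*(not E2) v (not E1)*E2.
decompositions_agree = same_event(
    lambda e1, e2: OR(e1, e2),
    lambda e1, e2: OR(e1 * e2, e1 * NOT(e2), NOT(e1) * e2),
    2,
)

print(co_causes_agree, decompositions_agree)
```

Both comparisons hold on every input, so the alternative co-causes and the alternative decomposition describe the same event D even though the sets of sufficient causes differ.
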

Corresponding to the definition of a sufficient cause is the more
philosophical notion of a causal mechanism. A causal mechanism can be
conceived of as a set of events or conditions which, if all present, bring
about the outcome under consideration through a particular pathway. A causal
mechanism thus provides a particular description of how the outcome comes
about. Suppose, for instance, that an individual were exposed to two poisons,
$E_1$ and $E_2$, such that in the absence of $E_2$, the poison $E_1$ would
lead to heart failure resulting in death; and that in the absence of $E_1$,
the poison $E_2$ would lead to respiratory failure resulting in death; but
such that when $E_1$ and $E_2$ are both present, they interact and lead to a
failure of the nervous system again resulting in death. In this case, there
are three distinct causal mechanisms for death, each corresponding to a
sufficient cause for D: death by heart failure corresponding to
$E_1\overline{E_2}$, death by respiratory failure corresponding to
$\overline{E_1}E_2$ and death due to a failure of the nervous system
corresponding to $E_1E_2$. It is interesting to note that in this case none of
the sufficient causes corresponding to the causal mechanisms is minimally
sufficient. Each of $E_1\overline{E_2}$, $\overline{E_1}E_2$ and $E_1E_2$ is
sufficient for D but none is minimally sufficient, as either $E_1$ or $E_2$
alone is sufficient for death. We will refer to a sufficient cause for D as a
causal mechanism for D if the node for the sufficient cause corresponds to
a variable, potentially subject to intervention, such that whenever the
variable takes the value 1, the outcome D inevitably results. 

The last example shows that the existence of a particular set of deter- 
minative sufficient causes does not guarantee that there are actual causal 
mechanisms corresponding to these sufficient causes; it only implies that 
a set of causal mechanisms corresponding to these sufficient causes cannot 
be ruled out by a complete knowledge of counterfactual outcomes. In
particular, in the previous example, the set $\{E_1, E_2\}$ is a determinative
set of sufficient causes that does not correspond to the actual set of causal
mechanisms $\{E_1\overline{E_2}, \overline{E_1}E_2, E_1E_2\}$. If there are
two or more sets of sufficient causes 
that are determinative for some outcome D then although the two sets of 
determinative sufficient causes are logically equivalent for prediction, we 
nevertheless view them as distinct. In such cases, some knowledge of the 
subject matter in question will, in general, be needed to discern which of 
the sets of determinative sufficient causes actually corresponds to the true 
causal mechanisms. For instance, in the previous example, we needed biolog- 
ical knowledge of how poisons brought about death in the various scenarios. 
We will, in the interpretation of our results, assume that there always exists 
some set of true causal mechanisms which forms a determinative set of suffi- 
cient causes for the outcome. The concept of synergism is closely related to 
that of a causal mechanism and is often found in the epidemiologic literature 
[11, 29, 32]. We will say that there is synergism between the effects of $E_1$
and $E_2$ on D if there exists a sufficient cause for D which represents some
causal mechanism and such that this sufficient cause has $E_1$ and $E_2$ in its 
conjunction. In related work, we have developed tests for synergism, that is, 
tests for the joint presence of two or more causes in a single sufficient cause 
[36, 37]. In some of our examples and in our discussion of the various results 
in the paper, we will sometimes make reference to the concepts of a causal 
mechanism and synergism. However, all definitions, propositions, lemmas, 
theorems and corollaries will be given in terms of sufficient causes for which 
we have a precise definition. 

The graphical representation of sufficient causes on a causal directed 
acyclic graph does not require that the determinative set of sufficient causes 
for D be minimally sufficient, nor does it require that the set of determina- 
tive sufficient causes for D be nonredundant. To expand a directed acyclic 
graph into another directed acyclic graph with sufficient cause nodes, all 
that is required is that the set of sufficient causes constitutes a determina- 
tive set of sufficient causes for D. However, a set of events that constitutes 
a sufficient cause can be reduced to a set of events that constitutes a min- 
imal sufficient cause by iteratively excluding unnecessary events from the 
set until a minimal sufficient cause is obtained. Also, a set of determinative 
sufficient causes that is redundant can be reduced to one that is nonredun- 
dant by excluding those sufficient causes or minimal sufficient causes that 
are redundant. It is sometimes an advantage to reduce a redundant set of 
sufficient causes to a nonredundant set of minimal sufficient causes. This 
is so because allowing sufficient causes that are not minimally sufficient or 
allowing redundant sufficient causes or redundant minimal sufficient causes 
can obscure the conditional independence relations implied by the structure 
of the causal directed acyclic graph. This is made evident in Example 3. 

Example 3. Consider the causal directed acyclic graph with the minimal
sufficient causation structure given in Figure 3(i). Conditioning on $D = 0$
conditions also on $AB = 0$ and $EF = 0$, and by the d-separation criterion,
A and E are conditionally independent given $D = 0$. But now consider an
expanded structure for this causal directed acyclic graph which involves
only minimal sufficient causes but which allows redundant minimal sufficient
causes. Define $Q = BE$; then AQ is a minimal sufficient cause for D
since $AQ = 1 \Rightarrow AB = 1 \Rightarrow D = 1$, but $A = 1 \nRightarrow D = 1$
and $Q = 1 \nRightarrow D = 1$. Now AB, AQ, EF is a determinative but redundant
set of minimal sufficient causes for D. Figure 3(ii) gives an alternative
causal directed acyclic graph with a minimal sufficient causation structure
for the causal relationships indicated in Figure 3(i). In Figure 3(ii),
conditioning on $D = 0$ conditions also on $AB = 0$, $AQ = 0$ and $EF = 0$, but
the d-separation criterion no longer implies that A and E are conditionally
independent given $D = 0$; because of conditioning on $D = 0$, there is an
unblocked path between A and E, namely $A - AQ - Q - BE - E$. Allowing the
redundant minimal sufficient cause AQ in the minimal sufficient causation
structure obscures the conditional independence relation. Similar examples
can be constructed to show that allowing sufficient causes that are not
minimally sufficient can also obscure conditional independence relations [35].

Fig. 3, panels (i) and (ii). Example illustrating that redundant sufficient
causes can obscure conditional independence relations.
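
The conditional independence claimed in Example 3 can be checked by exact enumeration. The sketch below assumes, beyond what the example states, that A, B, E and F are mutually independent binary variables with the hypothetical marginal probabilities in `P_ONE`, and that $D = AB \vee EF$. It also confirms that adding the redundant cause AQ, with $Q = BE$, does not change D, so only the graph, not the distribution, is affected.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginals; A, B, E and F are ASSUMED mutually independent.
P_ONE = {"A": Fraction(1, 3), "B": Fraction(1, 2),
         "E": Fraction(2, 5), "F": Fraction(3, 4)}


def prob_given_d0(event):
    """Exact P(event | D = 0), where D = AB v EF."""
    numerator = denominator = Fraction(0)
    for a, b, e, f in product((0, 1), repeat=4):
        values = {"A": a, "B": b, "E": e, "F": f}
        d = (a & b) | (e & f)
        # Adding the redundant cause AQ, with Q = BE, leaves D unchanged.
        assert d == (a & b) | (a & b & e) | (e & f)
        if d == 0:
            weight = Fraction(1)
            for name, value in values.items():
                weight *= P_ONE[name] if value else 1 - P_ONE[name]
            denominator += weight
            if event(values):
                numerator += weight
    return numerator / denominator


p_a = prob_given_d0(lambda v: v["A"] == 1)
p_e = prob_given_d0(lambda v: v["E"] == 1)
p_ae = prob_given_d0(lambda v: v["A"] == 1 and v["E"] == 1)

# A and E are independent given D = 0 exactly when the joint factorizes.
independent_given_d0 = (p_ae == p_a * p_e)
```

Exact fractions are used so that the factorization can be tested with equality rather than with a floating-point tolerance.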



Although allowing sufficient causes that are not minimally sufficient or 
allowing redundant sufficient causes or redundant minimal sufficient causes 
can obscure the conditional independence relations implied by the structure 
of the causal directed acyclic graph, it may sometimes be desirable to include 
nonminimal sufficient causes or redundant sufficient causes. For example, as 
noted above, nonminimal sufficient cause nodes or redundant sufficient cause 
nodes may represent separate causal mechanisms upon which it might be 
possible to intervene. Further discussion of conditional independence rela- 
tions in sufficient causation structures with nonminimally sufficient causes 
and redundant sufficient causes is given in Section 6. 

Note a sufficient cause need only involve one co-cause $A_i$ in its conjunction
because if it involved $A_{i_1}, \ldots, A_{i_k}$, then $A_{i_1}, \ldots, A_{i_k}$
could be replaced by the product $A_i' = A_{i_1} \cdots A_{i_k}$. In certain
cases though, it may be desirable to include more than one $A_i$ in a
sufficient cause if this corresponds to the actual causal mechanisms. If a
set of variables $A_0, \ldots, A_u$ satisfying Theorem 1 can be constructed
from functions of the random term $U = \varepsilon_D$ of the nonparametric
structural equation for D on G and their complements so that $A_i = f_i(U)$,
then H can be chosen to be the graph G with the additional nodes
$U, A_0, \ldots, A_u$ and with directed edges from U into each $A_i$ and from
each $A_i$ into D. This gives rise to the definition, given below, of a
representation for D.

Definition 6. If D and all of its parents on the causal directed acyclic
graph G are binary and there exists some set $\{A_i, P_i\}$ such that each
$P_i$ is some conjunction of the parents of D and their complements, such
that there exist functions $f_i$ for which $A_i = f_i(\varepsilon_D)$, where
$\varepsilon_D$ is the random term in the nonparametric structural equation
for D on G, and such that $D = \bigvee_i A_i P_i$, then $\{A_i, P_i\}$ is said
to constitute a representation for D.

If the $A_i$ variables are constructed from functions of the random term
$\varepsilon_D$ in the nonparametric structural equation for D on G, then
these $A_i$ variables may or may not allow for interpretation, and they may or
may not be such that an intervention on these $A_i$ variables is conceivable.
In certain cases, the $A_i$ variables may simply be logical constructs for
which no intervention is conceivable. Although in certain cases it may not be
possible to intervene on the $A_i$ variables, we will still refer to
conjunctions of the form $A_i P_i$ as sufficient causes for D, as it is
assumed that it is possible to intervene on the parents of D which constitute
the conjunction for $P_i$.

Suppose that for some node D on a causal directed acyclic graph G, a set
of variables $A_0, \ldots, A_u$ satisfying Theorem 1 can be constructed from
functions of the random term $U = \varepsilon_D$ in the nonparametric
structural equation for D on G, so that a representation for D is given by
$D = \bigvee_i A_i P_i$. Then, in order to simplify the diagram, instead of
adding to G the variable U and directed edges from U into each $A_i$ so as to
form the minimal sufficient causation structure, we will sometimes suppress U
and simply add an asterisk next to each $A_i$ indicating that the $A_i$
variables have a common cause.

Proposition 1. For any representation for D, the co-causes $A_i$ will be
independent of the parents of D on the original directed acyclic graph G.

Proof. This follows immediately from the fact that for any representa- 
tion for D, the co-causes are functions of the random term in the nonpara- 
metric structural equation for D. □ 

If some of the sufficient causes for D are unknown, then it is not obvious
how one might make use of Theorem 1. The theorem allowed for a sufficient
causation structure on a causal directed acyclic graph, provided there existed
some set of co-causes $A_0, \ldots, A_u$. Theorem 2 complements Theorem 1 in
that it essentially states that when D and all of its parents are binary such
a set of co-causes always exists. The variables $A_0, \ldots, A_u$ are
constructed from functions of the random term $\varepsilon_D$ in the
nonparametric structural equation for D on G. Before stating and proving
Theorem 2, we illustrate how the co-causes can be constructed by a simple
example.

Example 4. Suppose E is the only parent of D, then the structural
equation for D is given by $D = f(E, \varepsilon_D)$. Define $A_0$, $A_1$ and
$A_2$ as follows: let $A_0(\omega) = 1$ if
$f(1, \varepsilon_D(\omega)) = f(0, \varepsilon_D(\omega)) = 1$ and
$A_0(\omega) = 0$ otherwise; let $A_1(\omega) = 1$ if
$f(1, \varepsilon_D(\omega)) = 1$ and $f(0, \varepsilon_D(\omega)) = 0$, and
$A_1(\omega) = 0$ otherwise; and let $A_2(\omega) = 1$ if
$f(1, \varepsilon_D(\omega)) = 0$ and $f(0, \varepsilon_D(\omega)) = 1$, and
$A_2(\omega) = 0$ otherwise. It is easily verified that
$D = A_0 \vee A_1 E \vee A_2 \bar{E}$ and that $A_0$, $A_1 E$ and $A_2 \bar{E}$
constitute a determinative set of minimal sufficient causes for D.
Note that this construction will give a determinative set of minimal sufficient
causes for D regardless of the form of f and the distribution of
$\varepsilon_D$.

Theorem 2. Consider a causal directed acyclic graph G on which there
exists some node D such that D and all its parents are binary, then there
exist variables $A_0, \ldots, A_u$ that satisfy the conditions of Theorem 1
and such that the sufficient causes constructed from $A_0, \ldots, A_u$ along
with the parents of D on G and their complements are, in fact, minimal
sufficient causes.

Proof. The nonparametric structural equation for D is given by
$D = f(pa_D, \varepsilon_D)$. Suppose D has m parents on the original causal
directed acyclic graph G. Since these parents are binary, there are $2^m$
values which $pa_D$ can take. Since f maps $(pa_D, \varepsilon_D)$ to
$\{0, 1\}$, each value of $\varepsilon_D$ assigns to every possible realization
of $pa_D$ either 0 or 1 through f. There are $2^{2^m}$ such assignments. Thus,
without loss of generality, we may assume that $\varepsilon_D$ takes on some
finite number of distinct values $N \le 2^{2^m}$; and so, we may write the
sample space for $\varepsilon_D$ as $\Omega_D = \{\omega_1, \ldots, \omega_N\}$,
and we may use $\omega = \omega_i$ and $\varepsilon_D = \varepsilon_D(\omega_i)$
interchangeably. The co-causes $A_0, \ldots, A_u$ can be constructed as
follows. Let $W_i$ be the indicator $1_{\varepsilon_D = \varepsilon_D(\omega_i)}$.
Let $P_i$ be some conjunction of the parents of D and their complements, that
is, $P_i = F_1^i \cdots F_{n_i}^i$, where each $F_k^i$ is either a parent of D,
say $E_j$, or its complement $\bar{E}_j$. For each $P_i$, let $A_i \equiv 1$ if
$F_1^i \cdots F_{n_i}^i$ is a minimal sufficient cause for D and

\[ A_i = \bigvee_j \{ W_j : W_j F_1^i \cdots F_{n_i}^i \text{ is a minimal sufficient cause for } D \} \]

otherwise. Let $M_i = P_i$ if $A_i \equiv 1$, and $M_i = A_i P_i$ otherwise. It
must be shown that each $M_i = A_i F_1^i \cdots F_{n_i}^i$ is a minimal
sufficient cause and that the set of $M_i$'s constitutes a minimal sufficient
cause representation for D (or more precisely, the set of $M_i$'s for which
$A_i$ is not identically 0 constitutes a minimal sufficient cause
representation for D). We first show that each
$M_i = A_i F_1^i \cdots F_{n_i}^i$ is a minimal sufficient cause for D.
Clearly, this is the case if $A_i \equiv 1$. Now consider those $A_i$ such that
$A_i$ is not identically 0 and not identically 1 and suppose
$A_i = W_1^i \vee \cdots \vee W_{p_i}^i$, where each $W_j^i$ is such that
$W_j^i F_1^i \cdots F_{n_i}^i$ is a minimal sufficient cause for D. If
$A_i F_1^i \cdots F_{n_i}^i$ is not a minimal sufficient cause, then either
$F_1^i \cdots F_{n_i}^i = 1 \Rightarrow D = 1$ or there exists j such that

\[ A_i F_1^i \cdots F_{j-1}^i F_{j+1}^i \cdots F_{n_i}^i = 1 \Rightarrow D = 1. \]

Suppose first that $F_1^i \cdots F_{n_i}^i = 1 \Rightarrow D = 1$; then there
does not exist a $W_j$ such that $W_j F_1^i \cdots F_{n_i}^i$ is a minimal
sufficient cause for D; but this contradicts the assumption that $A_i$ is not
identically 1. On the other hand, if there exists j such that
$A_i F_1^i \cdots F_{j-1}^i F_{j+1}^i \cdots F_{n_i}^i = 1 \Rightarrow D = 1$,
then it is also the case that

\[ W_1^i F_1^i \cdots F_{j-1}^i F_{j+1}^i \cdots F_{n_i}^i = 1 \Rightarrow D = 1, \]

since $A_i$ is simply a disjunction of the $W_j^i$'s. However, it would then
follow that $W_1^i F_1^i \cdots F_{n_i}^i$ is not a minimal sufficient cause
for D. But this contradicts the definition of $W_1^i$. Thus,
$A_i F_1^i \cdots F_{n_i}^i$ must be a minimal sufficient cause for D.

It remains to be shown that the set of $M_i$'s for which $A_i$ is not
identically 0 constitutes a minimal sufficient cause representation for D. We
must show that if $D = 1$, then there exists an $M_i = A_i P_i$ for which
$M_i = 1$. Now D is a function of $(\varepsilon_D, E_1, \ldots, E_m)$, so let
$(\varepsilon_D^*, E_1^*, \ldots, E_m^*)$ be any particular value of
$(\varepsilon_D, E_1, \ldots, E_m)$ for which $D = 1$. Consider the set
$\{E_1, \ldots, E_m\}$. If for any j,

\[ \varepsilon_D = \varepsilon_D^*,\ E_1 = E_1^*, \ldots, E_{j-1} = E_{j-1}^*,\ E_{j+1} = E_{j+1}^*, \ldots, E_m = E_m^* \Rightarrow D = 1, \]

remove $E_j$ from $\{E_1, \ldots, E_m\}$. Continue to remove those $E_j$ from
this set which are not needed to maintain the implication $D = 1$. Suppose the
set that remains is $\{E_{h_1}, \ldots, E_{h_S}\}$; then either we have
$E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1$ or we have

\[ E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \nRightarrow D = 1 \]

and

\[ \varepsilon_D = \varepsilon_D^*,\ E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1. \]

If $E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1$, then if
we define $F_j$ as the indicator $F_j = 1_{(E_{h_j} = E_{h_j}^*)}$,
$F_1 \cdots F_S$ is a minimal sufficient cause for D and there thus exists an
i such that $P_i = F_1 \cdots F_S$ and $M_i = P_i$, and when
$E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^*$, we have $M_i = 1$. If
$E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \nRightarrow D = 1$ but
$\varepsilon_D = \varepsilon_D^*, E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow D = 1$,
then if we define $F_j$ as the indicator $1_{(E_{h_j} = E_{h_j}^*)}$,
$1_{\varepsilon_D = \varepsilon_D^*} F_1 \cdots F_S$ is a minimal sufficient
cause for D; and there exists an i such that $M_i = A_i P_i$ and
$P_i = F_1 \cdots F_S$; and $\varepsilon_D = \varepsilon_D^* \Rightarrow A_i = 1$,
such that

\[ \varepsilon_D = \varepsilon_D^*,\ E_{h_1} = E_{h_1}^*, \ldots, E_{h_S} = E_{h_S}^* \Rightarrow M_i = 1. \]

We have thus shown that when $D = 1$, there exists an $M_i$ such that
$M_i = 1$, and so the $M_i$'s constitute a minimal sufficient cause
representation for D. □

The variables $A_i$ constructed in Theorem 2, along with their corresponding
conjunctions $P_i$ of the parents of D and their complements, we define
below as the canonical representation for D. It is easily verified that the
co-causes and representation constructed in Example 4 constitute the canonical
representation for D in that example.

Definition 7. Consider a causal directed acyclic graph G, such that
some node D and all of its parents are binary. Let $\Omega_D$ be the sample
space for the random term $\varepsilon_D$ in the nonparametric structural
equation for D on G. The conjunctions $P_i = F_1^i \cdots F_{n_i}^i$, where
each $F_k^i$ is either a parent of D or the complement of a parent of D, along
with the variables $A_i$ constructed by $A_i \equiv 1$ if
$F_1^i \cdots F_{n_i}^i$ is a minimal sufficient cause for D and

\[ A_i = \bigvee_{\omega_j \in \Omega_D} \{ 1_{\varepsilon_D = \varepsilon_D(\omega_j)} : 1_{\varepsilon_D = \varepsilon_D(\omega_j)} F_1^i \cdots F_{n_i}^i \text{ is a minimal sufficient cause for } D \} \]

otherwise, is said to be the canonical representation for D.
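
When D has only a few binary parents and the random term takes finitely many values, the canonical representation can be computed by brute-force enumeration. The sketch below is one possible reading of the construction, not the authors' code: the function names, the encoding of a conjunction as `(parent index, required value)` pairs, and the four labels used for the random term in the single-parent case are all assumptions introduced here for illustration.

```python
from itertools import product

IDENTICALLY_ONE = "identically 1"  # marker for a co-cause A_i that is always 1


def canonical_representation(f, m, eps_values):
    """Return {conjunction: co-cause} for D = f(parents, eps), m binary parents.

    A conjunction is a tuple of (parent index, required value) pairs; a
    co-cause is IDENTICALLY_ONE or the set of eps values on which A_i = 1.
    """
    def sufficient(literals, eps_subset):
        # True if D = 1 whenever the literals hold and eps lies in eps_subset.
        return all(
            f(pa, eps) == 1
            for eps in eps_subset
            for pa in product((0, 1), repeat=m)
            if all(pa[i] == v for i, v in literals)
        )

    def minimal(literals, eps_subset):
        # Sufficient, and dropping any single parent literal breaks sufficiency.
        return sufficient(literals, eps_subset) and not any(
            sufficient(literals[:k] + literals[k + 1:], eps_subset)
            for k in range(len(literals))
        )

    representation = {}
    for pattern in product((None, 0, 1), repeat=m):
        literals = tuple((i, v) for i, v in enumerate(pattern) if v is not None)
        if minimal(literals, eps_values):
            representation[literals] = IDENTICALLY_ONE
        elif not sufficient(literals, eps_values):
            # W_j P_i can be minimal only if P_i alone is not sufficient.
            omegas = {eps for eps in eps_values if minimal(literals, [eps])}
            if omegas:
                representation[literals] = omegas
    return representation


def evaluate(representation, pa, eps):
    """Evaluate the disjunction of the A_i P_i at parent values pa and eps."""
    return int(any(
        (a == IDENTICALLY_ONE or eps in a)
        and all(pa[i] == v for i, v in literals)
        for literals, a in representation.items()
    ))


# Single-parent case: four hypothetical labels for the values of the random term.
EPS = ("never", "causal", "preventive", "doomed")


def f_example4(pa, eps):
    e = pa[0]
    return {"never": 0, "causal": e, "preventive": 1 - e, "doomed": 1}[eps]


rep = canonical_representation(f_example4, 1, EPS)
```

For the single-parent case the three entries of `rep` play the roles of $A_0$, $A_1 E$ and $A_2 \bar{E}$, and `evaluate` reproduces `f_example4` at every input. The enumeration is exponential in the number of parents, so it is meant only for small illustrative cases.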

As noted above, there will in general exist more than one set of co-causes
$A_0, \ldots, A_u$, which together with the parents of D and their complements
can be used to construct a sufficient cause representation for D. The set of
$A_i$'s in the canonical representation constitutes only one particular set of
variables which can be used to construct a sufficient cause representation.
If D has three or more parents, examples can be constructed in which the
canonical representation is redundant. Examples can also be constructed to
show that when the canonical representation is redundant, it is not always
uniquely reducible to a nonredundant minimal sufficient cause representation.
Although the canonical representation will not always be nonredundant, it
does however guarantee that for a binary variable with binary parents, a
determinative set of minimal sufficient causes always exists. The canonical
representation in a sense "favors" conjunctions with fewer terms. As can
be seen in the simple illustration given in Example 4, the canonical
representation will never have $A_i \equiv 1$, for some conjunction $P_i$,
when there is a conjunction $P_j$ with $A_j \equiv 1$ and such that the
components of $P_j$ are a subset of those in the conjunction for $P_i$.

4. Monotonic effects and minimal sufficient causation. Minimal suffi- 
cient causes for a particular event D may have present in their conjunction 
the parents of D or the complements of these parents. In certain cases, no 
minimal sufficient cause will involve the complement of a particular parent 
of D. Such cases closely correspond to what will be defined below as a posi- 
tive monotonic effect. Essentially, a positive monotonic effect will be said to 
be present when a function in a nonparametric structural equation is non- 
decreasing in a particular argument for all values of the other arguments of 
the function. In this section, we develop the relationship between minimal 
sufficient causation and monotonic effects. 

Definition 8. The nonparametric structural equation for some node
D on a causal directed acyclic graph with parent E can be expressed as
$D = f(\widetilde{pa}_D, E, \varepsilon_D)$, where $\widetilde{pa}_D$ are the
parents of D other than E; E is said to have a positive monotonic effect on D
if, for all $\widetilde{pa}_D$ and $\varepsilon_D$,
$f(\widetilde{pa}_D, E_1, \varepsilon_D) \ge f(\widetilde{pa}_D, E_2, \varepsilon_D)$
whenever $E_1 \ge E_2$. Similarly, E is said to have a negative monotonic
effect on D if, for all $\widetilde{pa}_D$ and $\varepsilon_D$,
$f(\widetilde{pa}_D, E_1, \varepsilon_D) \le f(\widetilde{pa}_D, E_2, \varepsilon_D)$
whenever $E_1 \ge E_2$.
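
For binary parents and a random term with finitely many values, the definition of a positive monotonic effect can be checked directly by enumeration. The following is a minimal sketch under those assumptions; the function name and the two example structural equations are hypothetical and are not taken from the paper.

```python
from itertools import product


def has_positive_monotonic_effect(f, m, j, eps_values):
    """True if f(pa, eps) is nondecreasing in parent j for all other inputs.

    f takes a tuple of m binary parent values and a value of the random term.
    """
    for eps in eps_values:
        for pa in product((0, 1), repeat=m):
            if pa[j] == 0:
                raised = pa[:j] + (1,) + pa[j + 1:]
                if f(raised, eps) < f(pa, eps):
                    return False
    return True


def d_synergy(pa, eps):
    # Hypothetical equation: D occurs if eps = 1 or both parents are present.
    return eps | (pa[0] & pa[1])


def d_blocked(pa, eps):
    # Hypothetical equation: the second parent prevents the effect of the first.
    return pa[0] & (1 - pa[1])


first_parent_monotone = has_positive_monotonic_effect(d_synergy, 2, 0, (0, 1))
second_parent_monotone = has_positive_monotonic_effect(d_blocked, 2, 1, (0, 1))
```

A negative monotonic effect can be checked in the same way by reversing the inequality in the inner comparison.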

Note that this notion of a monotonic effect is somewhat stronger than 
Wellman's qualitative probabilistic influence [41]. See [38, 39] for further 
discussion. 

Theorem 3. If E is a parent of D and if D and all of its parents are
binary, then the following are equivalent: (i) E has a positive monotonic
effect on D; (ii) there is some representation for D which is such that none
of the representation's conjunctions contain $\bar{E}$; (iii) the canonical
representation of D, $\bigvee_i A_i P_i$, is such that no conjunction $P_i$
contains $\bar{E}$.

Proof. We see that (iii) implies (ii) because the representation required
by (ii) is met by the canonical representation of D, as constructed in
Theorem 2. To show that (ii) implies (i), we assume that we have a
representation for D such that $D = \bigvee_i A_i P_i$, where each $P_i$ is
some conjunction of the parents of D and their complements but does not
contain $\bar{E}$. If $f(\widetilde{pa}_D, \bar{E}, \varepsilon_D) = 1$, then
$f(\widetilde{pa}_D, E, \varepsilon_D) = 1$ because $D = \bigvee_i A_i P_i$ and
none of the $P_i$ involve $\bar{E}$; from this, (i) follows. To show that (i)
implies (iii) we prove the contrapositive. Suppose that the canonical
representation of D, $\{A_i, P_i\}$, is such that there exists a $P_i$ which
contains $\bar{E}$ in its conjunction. Then there exists some value
$\varepsilon_D^*$ of $\varepsilon_D$ and some conjunction of the parents of D
and their complements, say $F_1 \cdots F_n$, such that
$W_i F_1 \cdots F_n \bar{E}$ constitutes a minimal sufficient cause for D,
where $W_i = 1_{(\varepsilon_D = \varepsilon_D^*)}$. Let $\widetilde{pa}_D^*$
take the values given by $F_1 \cdots F_n$. This may not suffice to fix
$\widetilde{pa}_D^*$, but there must exist some value of the remaining parents
of D other than E which, in conjunction with $W_i F_1 \cdots F_n E$, gives
$D = 0$; for if there were no such values of the other parents, then
$W_i F_1 \cdots F_n$ itself would be sufficient for D, and
$W_i F_1 \cdots F_n \bar{E}$ would not be a minimal sufficient cause for D.
Let $\widetilde{pa}_D^*$ be such that $\widetilde{pa}_D^*$ and $\bar{E}$
together with $\varepsilon_D^*$ give $D = 1$, but $\widetilde{pa}_D^*$ and E
with $\varepsilon_D^*$ give $D = 0$. Then
$f(\widetilde{pa}_D^*, \bar{E}, \varepsilon_D^*) = 1$, but
$f(\widetilde{pa}_D^*, E, \varepsilon_D^*) = 0$, and thus, (i) does not hold.
This completes the proof. □

5. Conditional covariance and minimal sufficient causation. When two 
binary parents of some event D have positive monotonic effects on D, it is 
in some cases possible to determine the sign of the conditional covariance of 
these two parents. In general, even in the setting of monotonic effects, the 
conditional covariance may be of either positive or negative sign; however, 
when additional knowledge is available concerning the minimal sufficient 
causation structure of D, it is often possible to determine the sign of the
conditional covariance of two parents of D. Theorem 4 gives conditions under
which the sign of the conditional covariance can be determined. Theorems 
5 and 6 extend the conclusions of Theorem 4 to certain cases concerning 
the conditional covariance of two variables that may not be parents of the 
conditioning variable. The proof of Theorem 4 is suppressed; the proof in- 
volves extensive but routine algebraic manipulation and factoring (details 
are available from the authors upon request). 

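Theorem 4. Suppose that $E_1$ and $E_2$ are the only parents of D on some
causal directed acyclic graph, that $E_1$, $E_2$ and D are all binary and that
both $E_1$ and $E_2$ have a positive monotonic effect on D. Then, for any
representation for D such that
$D = A_0 \vee A_1 E_1 \vee A_2 E_2 \vee A_3 E_1 E_2$, the following hold,
where conditioning on D denotes conditioning on $D = 1$ and conditioning on
$\bar{D}$ denotes conditioning on $D = 0$:

(i) If $A_0 \equiv 0$, then $\mathrm{Cov}(E_1, E_2 \mid D) \le 0$.

(ii) If $A_0 \equiv 0$, $A_1$ and $A_2$ are independent and $E_1$ and $E_2$
are independent, then $\mathrm{Cov}(E_1, E_2 \mid \bar{D}) \le 0$.

(iii) If $A_1 \equiv 1$ or $A_2 \equiv 1$, then
$\mathrm{Cov}(E_1, E_2 \mid D) \le 0$ provided $\mathrm{Cov}(E_1, E_2) \le 0$.

(iv) If $A_1 \equiv 1$ or $A_2 \equiv 1$, then
$\mathrm{Cov}(E_1, E_2 \mid \bar{D}) = 0$.

(v) If $A_1 \equiv 0$ or $A_2 \equiv 0$, then
$\mathrm{Cov}(E_1, E_2 \mid D) \ge 0$ provided $\mathrm{Cov}(E_1, E_2) \ge 0$.

(vi) If $A_1 \equiv 0$ or $A_2 \equiv 0$, then
$\mathrm{Cov}(E_1, E_2 \mid \bar{D}) \le 0$ provided
$\mathrm{Cov}(E_1, E_2) \le 0$.

(vii) If $A_3 \equiv 0$, then $\mathrm{Cov}(E_1, E_2 \mid D) \le 0$ provided
$\mathrm{Cov}(E_1, E_2) \le 0$.

(viii) If $A_3 \equiv 0$, $A_1$ and $A_2$ are independent, $E_1$ and $E_2$ are
independent and also $A_0$ is independent of either $A_1$ or $A_2$, then
$\mathrm{Cov}(E_1, E_2 \mid \bar{D}) = 0$.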
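
Individual parts of Theorem 4 can be spot-checked numerically. The sketch below computes the conditional covariance of the two parents exactly for a representation $D = A_0 \vee A_1 E_1 \vee A_2 E_2 \vee A_3 E_1 E_2$. It assumes, more strongly than the theorem requires, that the four co-causes are mutually independent Bernoulli variables that are also independent of $(E_1, E_2)$; the function name and all probabilities are hypothetical.

```python
from fractions import Fraction
from itertools import product


def cond_cov_parents(a_probs, e_joint, d_value):
    """Exact Cov(E1, E2 | D = d_value).

    a_probs: P(A_k = 1) for k = 0..3 (co-causes ASSUMED mutually independent).
    e_joint: {(e1, e2): probability} for the two binary parents.
    """
    mass = {}
    for (e1, e2), p_e in e_joint.items():
        for a in product((0, 1), repeat=4):
            p_a = Fraction(1)
            for a_k, p in zip(a, a_probs):
                p_a *= p if a_k else 1 - p
            d = a[0] | (a[1] & e1) | (a[2] & e2) | (a[3] & e1 & e2)
            if d == d_value:
                mass[(e1, e2)] = mass.get((e1, e2), Fraction(0)) + p_e * p_a
    total = sum(mass.values())
    mean_e1 = sum(w for (e1, _), w in mass.items() if e1) / total
    mean_e2 = sum(w for (_, e2), w in mass.items() if e2) / total
    mean_e1e2 = mass.get((1, 1), Fraction(0)) / total
    return mean_e1e2 - mean_e1 * mean_e2


# Parents that are positively correlated before conditioning (hypothetical).
E_JOINT = {(0, 0): Fraction(2, 5), (1, 1): Fraction(2, 5),
           (1, 0): Fraction(1, 10), (0, 1): Fraction(1, 10)}
half = Fraction(1, 2)

# Setting of part (i): A_0 identically 0, conditioning on D = 1.
cov_part_i = cond_cov_parents((Fraction(0), half, half, half), E_JOINT, 1)

# Setting of part (iv): A_1 identically 1, conditioning on D = 0.
cov_part_iv = cond_cov_parents((Fraction(1, 4), Fraction(1), half, half), E_JOINT, 0)
```

In the first setting the conditional covariance comes out negative even though the parents are positively correlated marginally, because $D = 1$ rules out $E_1 = E_2 = 0$ when $A_0$ is identically 0. In the second setting $D = 0$ forces $E_1 = 0$, so the conditional covariance is zero.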

Note that parts (i)-(viii) of Theorem 4 all require some knowledge of a
sufficient cause representation for D, that is, that $A_0 \equiv 0$,
$A_1 \equiv 1$ or $A_1 \equiv 0$, etc. Conclusions about the sign of the
conditional covariance cannot be drawn from Theorem 4 without some knowledge
of a sufficient causation structure. In general, this knowledge of a
sufficient causation structure would come from prior beliefs about the actual
causal mechanisms for D. As can be seen from Theorem 4, if no knowledge of the
sufficient causes is available, the conditional covariances
$\mathrm{Cov}(E_1, E_2 \mid D)$ and $\mathrm{Cov}(E_1, E_2 \mid \bar{D})$ may be
of either sign, even if $E_1$ and $E_2$ have positive monotonic effects on D.
For example, if $E_1$ and $E_2$ have positive monotonic effects on D and (v)
holds, then $\mathrm{Cov}(E_1, E_2 \mid D) \ge 0$; but if $E_1$ and $E_2$ have
positive monotonic effects on D and (i) holds, then
$\mathrm{Cov}(E_1, E_2 \mid D) \le 0$.

If $E_1$ and $E_2$ are the only parents of D, possibly correlated due to some
common cause C, and have positive monotonic effects on D, then the minimal
sufficient causation structure for the causal directed acyclic graph is that
given in Figure 4.

Fig. 4. Minimal sufficient causation structure when $E_1$ and $E_2$ have
positive monotonic effects on D.

Recall the asterisk is used to indicate that $A_0$, $A_1$, $A_2$ or $A_3$ may
have a common cause U. If one of $A_0$, $A_1$, $A_2$ or $A_3$ is identically 0
or 1, then Theorem 4 may be used to draw conclusions about the sign of the
conditional covariance $\mathrm{Cov}(E_1, E_2 \mid D)$. For example, if one
believes that there is no synergism between $E_1$ and $E_2$ in the actual
causal mechanisms for D, then $A_3 \equiv 0$; if this holds, then parts (vii)
and (viii) of Theorem 4 can be used to determine the sign of the conditional
covariance. Theorem 4 has an obvious analogue if one or both of $E_1$ or $E_2$
have a negative monotonic effect on D. If D has more than two parents, but if
the two parents, $E_1$ and $E_2$, are independent of all other parents of D,
then the causal directed acyclic graph can be marginalized over these other
parents, and Theorem 4 could be applied to the resulting causal directed
acyclic subgraph.

Some of the conclusions of Theorem 4 require knowing the sign of
$\mathrm{Cov}(E_1, E_2)$, and Proposition 2 below (proved elsewhere [39])
relates the sign of $\mathrm{Cov}(E_1, E_2)$ to the presence of monotonic
effects. In order to state this proposition and to allow for the development
of extensions to Theorem 4, we need a few additional definitions.

Definition 9. An edge on a causal directed acyclic graph from X to 
Y is said to be of positive (negative) sign if X has a positive (negative) 
monotonic effect on Y. If X has neither a positive monotonic effect nor a 
negative monotonic effect on Y, then the edge from X to Y is said to be 
without a sign. 

Fig. 5. Examples requiring extensions to Theorem 4.

Definition 10. The sign of a path on a causal directed acyclic graph
is the product of the signs of the edges that constitute that path. If one of
the edges on a path is without a sign, then the sign of the path is said to be
undefined.
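
The sign of a path is a simple product, which a few lines of code make concrete. In this sketch edge signs are encoded as `+1`, `-1`, or `None` for an edge without a sign; that encoding and the function name are assumptions made for illustration.

```python
from math import prod


def path_sign(edge_signs):
    """Return +1 or -1 for a fully signed path, or None if the sign is undefined."""
    if any(sign is None for sign in edge_signs):
        return None  # one unsigned edge makes the whole path unsigned
    return prod(edge_signs)


two_negative_edges = path_sign([+1, -1, -1])
one_unsigned_edge = path_sign([+1, None, -1])
```

Two negative edges cancel, so the first path is positive, while the second path has no defined sign because one of its edges is unsigned.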

Definition 11. Two variables X and Y are said to be positively
monotonically associated if all directed paths between X and Y are of positive
sign, and all common causes $C_i$ of X and Y are such that all directed paths
from $C_i$ to X not through Y are of the same sign as all directed paths from
$C_i$ to Y not through X; the variables X and Y are said to be negatively
monotonically associated if all directed paths between X and Y are of
negative sign, and all common causes $C_i$ of X and Y are such that all
directed paths from $C_i$ to X not through Y are of the opposite sign as all
directed paths from $C_i$ to Y not through X.

Proposition 2. If X and Y are positively monotonically associated,
then $\mathrm{Cov}(X, Y) \ge 0$. If X and Y are negatively monotonically
associated, then $\mathrm{Cov}(X, Y) \le 0$.

Rules for the propagation of signs have been developed elsewhere [38,
39, 41] and, as seen from Proposition 2, are useful for determining the sign
of covariances; however, as will be seen below, rules for deriving the sign
of conditional covariances are more subtle. Theorem 4 concerns the
conditional covariance of two parents of the node D. However, often what will
be desired is the sign of the conditional covariance of two variables which
are not parents of the conditioning node. For example, in the coaggregation
problem discussed in the Introduction, we wanted to draw conclusions about
$\mathrm{Cov}(P_2, B_1 \mid P_1 = 1)$, but neither $P_2$ nor $B_1$ is a parent
of $P_1$ in Figure 1. In the remainder of the paper we will thus extend
Theorem 4 so as to allow for application to two variables, say F and G, which
are not parents of the conditioning node D. The variables F and G might be
ancestors, descendants or have common causes with the parents, $E_1$ and
$E_2$, of D. Consider, for example, the causal directed acyclic graphs in
Figure 5.

If we were interested in the sign of $\mathrm{Cov}(F, G \mid D)$ in Figures
5(i)-(iii), then clearly Theorem 4 is insufficient. Theorems 5 and 6 below
will allow us to extend the conclusions of Theorem 4 to examples such as those
in Figure 5 and to certain other cases involving two variables that may not be
parents of the conditioning variable. Lemmas 1-5 below will be needed in the
proofs and application of Theorems 5 and 6. Lemmas 1 and 2 are consequences
of Theorems 1 and 2 in the work of Esary, Proschan and Walkup [5]. Lemmas 3-5
are proved elsewhere in related work concerning the properties of monotonic
effects [38].

Lemma 1. Let f and g be functions with n real-valued arguments, such
that both f and g are nondecreasing in each of their arguments. If
$X = (X_1, \ldots, X_n)$ is a multivariate random variable with n components,
such that each component is independent of the other components, then
$\mathrm{Cov}(f(X), g(X)) \ge 0$.

Lemma 2. If F and G are binary and $u_1$ and $u_2$ are nondecreasing
functions, then
$\mathrm{sign}(\mathrm{Cov}(u_1(F), u_2(G))) = \mathrm{sign}(\mathrm{Cov}(F, G))$.
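
A short derivation may help show why the sign relation in Lemma 2 holds for binary variables. The sketch below uses only the fact that a function of a binary variable is affine in that variable; the increments $\Delta_1$ and $\Delta_2$ are symbols introduced here for illustration and do not appear in the original.

```latex
% For binary F and G, write \Delta_k = u_k(1) - u_k(0), so that
% u_1(F) = u_1(0) + \Delta_1 F and u_2(G) = u_2(0) + \Delta_2 G.
\begin{align*}
\mathrm{Cov}\bigl(u_1(F), u_2(G)\bigr)
  &= \mathrm{Cov}\bigl(u_1(0) + \Delta_1 F,\; u_2(0) + \Delta_2 G\bigr) \\
  &= \Delta_1 \Delta_2 \, \mathrm{Cov}(F, G).
\end{align*}
% Nondecreasing u_1 and u_2 give \Delta_1 \ge 0 and \Delta_2 \ge 0.
```

When both increments are strictly positive the two covariances share a sign. If either increment is zero, the left-hand side is zero whatever the value of $\mathrm{Cov}(F, G)$, so the sign equality is best read with that degenerate case in mind.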

Lemma 3. Let X denote some set of nondescendants of A that blocks
all backdoor paths from A to Y. If all directed paths between A and Y are
positive, then $P(Y > y \mid a, x)$ and $\mathbb{E}[y \mid a, x]$ are
nondecreasing in a.

Lemma 4. Suppose that E is binary. Let Q be some set of variables
which are not descendants of F nor of E, and let C be the common causes
of E and F not in Q. If all directed paths from E to F (or from F to E)
are of positive sign and all directed paths from C to E not through
$\{Q, F\}$ are of the same sign as all directed paths from C to F not through
$\{Q, E\}$, then $\mathbb{E}[F \mid E, Q]$ is nondecreasing in E.

Lemma 5. Suppose that E is not a descendant of F. Let Q be some set
of nondescendants of E that block all backdoor paths from E to F and let D
be a node on a directed path from E to F such that all backdoor paths from D
to F are blocked by $\{E, Q\}$. If all directed paths from E to F, except
possibly those through D, are of positive sign, then
$\mathbb{E}[F \mid D, Q, E]$ is nondecreasing in E.

Obvious analogues concerning negative signs hold for all of the lemmas
above. Theorem 5 below will allow us to determine the sign of the conditional
covariance of F and G on graphs like those in Figure 5, provided there
are appropriate signs on the edges. The conclusion of Theorem 5 concerns
the equality of the sign of two conditional covariances,
$\mathrm{Cov}(F, G \mid D)$ and $\mathrm{Cov}(E_1, E_2 \mid D)$. The theorem
itself does not require knowledge of a sufficient causation representation
and thus applies to general causal directed acyclic graphs. However, to draw
conclusions about the sign of $\mathrm{Cov}(E_1, E_2 \mid D)$, one must still
appeal to Theorem 4, which does require some knowledge of a sufficient
causation representation.

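Theorem 5. Suppose that $E_1$, $E_2$ and D are binary variables, that $E_1$
and $E_2$ are parents of D, that F and G are d-separated given
$\{E_1, E_2, D\}$, that F and $\{E_2, D\}$ are d-separated given $E_1$ and that
G and $\{E_1, D\}$ are d-separated given $E_2$. If
$\mathrm{Cov}(F, E_1) \ge 0$ and $\mathrm{Cov}(G, E_2) \ge 0$, then
$\mathrm{sign}(\mathrm{Cov}(F, G \mid D)) = \mathrm{sign}(\mathrm{Cov}(E_1, E_2 \mid D))$.

Proof. Conditioning on $E_1$ and $E_2$, we have

\[ \mathrm{Cov}(F, G \mid D) = \mathbb{E}[\mathrm{Cov}(F, G \mid D, E_1, E_2) \mid D] + \mathrm{Cov}(\mathbb{E}[F \mid D, E_1, E_2], \mathbb{E}[G \mid D, E_1, E_2] \mid D). \]

The first expression is 0 since F and G are d-separated given
$\{E_1, E_2, D\}$. Furthermore, since F and $\{E_2, D\}$ are d-separated given
$E_1$ and G and $\{E_1, D\}$ are d-separated given $E_2$, the second expression
can be reduced to
$\mathrm{Cov}(\mathbb{E}[F \mid E_1], \mathbb{E}[G \mid E_2] \mid D)$. Thus,

\[ \mathrm{Cov}(F, G \mid D) = \mathrm{Cov}(\mathbb{E}[F \mid E_1], \mathbb{E}[G \mid E_2] \mid D). \]

If $\mathrm{Cov}(F, E_1) \ge 0$ and $\mathrm{Cov}(G, E_2) \ge 0$ then, since
$E_1$ and $E_2$ are binary, we have that $\mathbb{E}[F \mid E_1]$ is
nondecreasing in $E_1$ and $\mathbb{E}[G \mid E_2]$ is nondecreasing in $E_2$,
and so by Lemma 2,
$\mathrm{sign}(\mathrm{Cov}(\mathbb{E}[F \mid E_1], \mathbb{E}[G \mid E_2] \mid D)) = \mathrm{sign}(\mathrm{Cov}(E_1, E_2 \mid D))$.
We thus have

\[ \mathrm{sign}(\mathrm{Cov}(F, G \mid D)) = \mathrm{sign}(\mathrm{Cov}(E_1, E_2 \mid D)) \]

and this completes the proof. □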
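
The conclusion of Theorem 5 can be illustrated on a small graph by exact enumeration. The sketch below uses the hypothetical graph G -> E2 -> D <- E1 -> F with $D = E_1 \vee E_2$. The graph, the conditional probabilities and all names are assumptions chosen so that the theorem's d-separation and covariance conditions hold; they are not taken from Figure 5.

```python
from fractions import Fraction
from itertools import product


def bernoulli(p, x):
    """Probability that a Bernoulli(p) variable equals x."""
    return p if x == 1 else 1 - p


def joint():
    """Yield (probability, outcome) for the graph G -> E2 -> D <- E1 -> F."""
    half = Fraction(1, 2)
    for g, e1, e2, f in product((0, 1), repeat=4):
        p = (bernoulli(half, g)
             * bernoulli(half, e1)
             * bernoulli(Fraction(3, 4) if g else Fraction(1, 4), e2)   # E2 rises with G
             * bernoulli(Fraction(4, 5) if e1 else Fraction(1, 5), f))  # F rises with E1
        yield p, {"G": g, "E1": e1, "E2": e2, "F": f, "D": e1 | e2}


def cond_cov(x, y, given, value):
    """Exact Cov(x, y | given = value) under the joint distribution above."""
    rows = [(p, o) for p, o in joint() if o[given] == value]
    total = sum(p for p, _ in rows)
    mean_x = sum(p * o[x] for p, o in rows) / total
    mean_y = sum(p * o[y] for p, o in rows) / total
    mean_xy = sum(p * o[x] * o[y] for p, o in rows) / total
    return mean_xy - mean_x * mean_y


cov_parents = cond_cov("E1", "E2", "D", 1)
cov_fg = cond_cov("F", "G", "D", 1)
same_sign = (cov_parents > 0) == (cov_fg > 0) and (cov_parents < 0) == (cov_fg < 0)
```

Here F depends only on $E_1$ and G influences D only through $E_2$, so conditioning on D induces an association between F and G only through the two parents, and the two conditional covariances share a sign.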

Note Theorem 5 requires that $\mathrm{Cov}(F, E_1) \ge 0$ and
$\mathrm{Cov}(G, E_2) \ge 0$; Proposition 2 can be used to check whether these
covariances are nonnegative; that is, the covariances will be nonnegative if
F and $E_1$ are positively monotonically associated and if G and $E_2$ are
positively monotonically associated.

Example 5. Note that the graphs in Figures 5(i) and (ii) satisfy the
d-separation restrictions of Theorem 5. In Figure 5(i), G is an ancestor of
$E_2$ whereas F is related to $E_1$ as a descendant and by a common cause.
In Figure 5(ii), F is a descendant of $E_1$ and G is related to $E_2$ both as
an ancestor and by a common cause. The d-separation restrictions of Theorem 5
would still hold in Figures 5(i) and (ii) if F and $E_1$ or G and $E_2$ had
multiple common causes or if there were several intermediate variables between
$E_1$ and F and between G and $E_2$.

Note, however, that Theorem 5 requires that F be d-separated from
$\{E_2, D\}$ given $E_1$ and that G be d-separated from $\{E_1, D\}$ given
$E_2$. Thus, if F or G were a descendant of D, these assumptions would be
violated. Consequently, Theorem 5 could not be applied to the diagram in
Figure 5(iii). Nor could Theorem 5 be applied to the paper's introductory
motivation to draw conclusions about the sign of
$\mathrm{Cov}(P_2, B_1 \mid P_1 = 1)$ for the graph in Figure 1, since $B_1$ is
a descendant of the conditioning variable $P_1$.

Theorem 6 below gives a result that allows for F and G to be descendants
of D. Before stating this result we note, however, that Theorem 5 is restricted
in yet another way. Theorem 5 required that F and G be d-separated given
$\{E_1, E_2, D\}$. If F and G have common causes then the d-separation
restrictions required by Theorem 5 will again, in general, not hold. Theorem 5
would thus not apply to the graphs given in Figure 6.

Theorem 6 gives a result similar to Theorem 5 which allows for F or G to
be descendants of D and allows also for F and G to have common causes. As
with Theorem 5, the conclusion of Theorem 6 concerns the equality of the
sign of two conditional covariances and the theorem itself does not require
knowledge of a sufficient causation representation. But once again, to draw
conclusions about the sign of $\mathrm{Cov}(F, G \mid D)$ using Theorem 6, one
must know the sign of $\mathrm{Cov}(E_1, E_2 \mid D)$ and thus, appeal must
again be made to Theorem 4, which does require some knowledge of a sufficient
causation representation.

Fig. 6, panels (i) and (ii). Examples in which F and G have a common cause.

Theorem 6. Suppose that $E_1$, $E_2$ and D are binary variables, that $E_1$
and $E_2$ are parents of D, that F and G are d-separated given
$\{E_1, E_2, D, Q\}$, where Q is some set of common causes of F and G (each
component of which is univariate and independent of the other components in
Q), that F and $E_2$ are d-separated given $\{E_1, D, Q\}$, that G and $E_1$
are d-separated given $\{E_2, Q, D\}$, that Q and $\{E_1, E_2\}$ are
d-separated given D and that Q and D are d-separated. Suppose also that
$\mathbb{E}[F \mid E_1, D, Q]$ is nondecreasing in $E_1$ and that
$\mathbb{E}[G \mid E_2, D, Q]$ is nondecreasing in $E_2$. If
$\mathrm{Cov}(E_1, E_2 \mid D) \ge 0$, and for each element $Q_i$ of Q, every
directed path from $Q_i$ to F is the same sign as every directed path from
$Q_i$ to G, then $\mathrm{Cov}(F, G \mid D) \ge 0$. If
$\mathrm{Cov}(E_1, E_2 \mid D) \le 0$, and for each element $Q_i$ of Q, every
directed path from $Q_i$ to F is the opposite sign as every directed path
from $Q_i$ to G, then $\mathrm{Cov}(F, G \mid D) \le 0$.

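Proof. We will prove the first of the results above; the proof of the
second is similar. Conditioning on $\{E_1, E_2, Q\}$, we have

\[
\begin{aligned}
\operatorname{Cov}(F, G \mid D) &= \mathbb{E}[\operatorname{Cov}(F, G \mid D, Q, E_1, E_2) \mid D] \\
&\quad + \operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid D).
\end{aligned}
\]

The first expression is zero since $F$ and $G$ are d-separated given $\{E_1, E_2, Q, D\}$.
We can furthermore rewrite the second expression as follows:

\[
\begin{aligned}
\operatorname{Cov}(F, G \mid D)
&= \operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid D) \\
&= \mathbb{E}[\operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D) \mid D] \\
&\quad + \operatorname{Cov}(\mathbb{E}[\mathbb{E}[F \mid D, Q, E_1, E_2] \mid Q, D], \mathbb{E}[\mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D] \mid D).
\end{aligned}
\]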

We will show that each of these two expressions is nonnegative. Since $F$ and
$E_2$ are d-separated given $\{E_1, D, Q\}$, $\mathbb{E}[F \mid D, Q, E_1, E_2] = \mathbb{E}[F \mid E_1, D, Q]$; and
since $G$ and $E_1$ are d-separated given $\{E_2, D, Q\}$, $\mathbb{E}[G \mid D, Q, E_1, E_2] =
\mathbb{E}[G \mid E_2, D, Q]$. By assumption, we have that $\mathbb{E}[F \mid E_1, D, Q]$ is nondecreasing
in $E_1$ and that $\mathbb{E}[G \mid E_2, D, Q]$ is nondecreasing in $E_2$. For fixed $q$,

\[
\begin{aligned}
&\operatorname{Cov}(\mathbb{E}[F \mid D, Q = q, E_1, E_2], \mathbb{E}[G \mid D, Q = q, E_1, E_2] \mid Q = q, D) \\
&\quad = \operatorname{Cov}(\mathbb{E}[F \mid E_1, D, Q = q], \mathbb{E}[G \mid E_2, D, Q = q] \mid Q = q, D) \\
&\quad = \operatorname{Cov}(\mathbb{E}[F \mid E_1, D, Q = q], \mathbb{E}[G \mid E_2, D, Q = q] \mid D),
\end{aligned}
\]

since $Q$ and $\{E_1, E_2\}$ are d-separated given $D$. And since $\mathbb{E}[F \mid E_1, D, Q = q]$
is nondecreasing in $E_1$ and $\mathbb{E}[G \mid E_2, D, Q = q]$ is nondecreasing in $E_2$, by
Lemma 2 together with the hypothesis $\operatorname{Cov}(E_1, E_2 \mid D) \ge 0$, we have
$\operatorname{Cov}(\mathbb{E}[F \mid E_1, D, Q = q], \mathbb{E}[G \mid E_2, D, Q = q] \mid D) \ge 0$.
Thus, we have that
$\operatorname{Cov}(\mathbb{E}[F \mid D, Q = q, E_1, E_2], \mathbb{E}[G \mid D, Q = q, E_1, E_2] \mid Q = q, D) \ge 0$
for all $q$ and taking expectations over $Q$ we have
$\mathbb{E}[\operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D) \mid D] \ge 0$.
We have shown that the first of the two expressions above is nonnegative.
We now show that the second expression

\[
\operatorname{Cov}(\mathbb{E}[\mathbb{E}[F \mid D, Q, E_1, E_2] \mid Q, D], \mathbb{E}[\mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D] \mid D)
\]

is also nonnegative. As before, $\mathbb{E}[F \mid D, Q, E_1, E_2] = \mathbb{E}[F \mid E_1, D, Q]$ and
$\mathbb{E}[G \mid D, Q, E_1, E_2] = \mathbb{E}[G \mid E_2, D, Q]$. By hypothesis, for each element $Q_i$ of $Q$
every directed path from $Q_i$ to $F$ is of the same sign as every directed path
from $Q_i$ to $G$; without loss of generality, we may assume that the sign of
all of these directed paths is positive. By Lemma 3 with $X = \{E_1, D\}$ and
$X = \{E_2, D\}$, respectively, $\mathbb{E}[F \mid E_1, D, Q = q]$ and $\mathbb{E}[G \mid E_2, D, Q = q]$ are both
nondecreasing in each dimension of $q$. Note that we may apply Lemma 3
because if there were any backdoor paths from $Q$ to $F$ or to $G$, then $Q$
would have some parent which would also be a common cause of $F$ and $G$
and thus also a member of the set $Q$, but this would violate the assumption
that the members of $Q$ were independent of one another. Furthermore,

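\[
\begin{aligned}
\mathbb{E}[\mathbb{E}[F \mid D, Q = q, E_1, E_2] \mid Q = q, D]
&= \mathbb{E}[\mathbb{E}[F \mid E_1, D, Q = q] \mid Q = q, D] \\
&= \mathbb{E}[\mathbb{E}[F \mid E_1, D, Q = q] \mid D]
\end{aligned}
\]

and similarly,
$\mathbb{E}[\mathbb{E}[G \mid D, Q = q, E_1, E_2] \mid Q = q, D] = \mathbb{E}[\mathbb{E}[G \mid E_2, D, Q = q] \mid Q = q, D] =
\mathbb{E}[\mathbb{E}[G \mid E_2, D, Q = q] \mid D]$, since $Q$ and $\{E_1, E_2\}$ are d-separated given $D$.
Thus,

\[
\mathbb{E}[\mathbb{E}[F \mid D, Q = q, E_1, E_2] \mid Q = q, D] = \mathbb{E}[\mathbb{E}[F \mid E_1, D, Q = q] \mid D]
\]

and

\[
\mathbb{E}[\mathbb{E}[G \mid D, Q = q, E_1, E_2] \mid Q = q, D] = \mathbb{E}[\mathbb{E}[G \mid E_2, D, Q = q] \mid D]
\]

are both nondecreasing in each dimension of $q$, from which it follows by
Lemma 1 that
$\operatorname{Cov}(\mathbb{E}[\mathbb{E}[F \mid D, Q, E_1, E_2] \mid Q, D], \mathbb{E}[\mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D]) \ge 0$.
Since $Q$ and $D$ are d-separated we also have

\[
\begin{aligned}
&\operatorname{Cov}(\mathbb{E}[\mathbb{E}[F \mid D, Q, E_1, E_2] \mid Q, D], \mathbb{E}[\mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D] \mid D) \\
&\quad = \operatorname{Cov}(\mathbb{E}[\mathbb{E}[F \mid D, Q, E_1, E_2] \mid Q, D], \mathbb{E}[\mathbb{E}[G \mid D, Q, E_1, E_2] \mid Q, D]) \ge 0
\end{aligned}
\]

and this completes the proof. □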
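The step of the proof that appeals to Lemma 2 can be illustrated by a short calculation for binary $E_1$ and $E_2$. The identity below is a standard fact about functions of binary variables and is offered only as a sketch of why monotonicity lets the sign of $\operatorname{Cov}(E_1, E_2 \mid D)$ carry over; it is not necessarily the form in which Lemma 2 is stated. The shorthand $f$, $g$ and the fixed stratum $D = d$ are introduced here for illustration.

```latex
% Assumed shorthand (not in the original): within a fixed stratum D = d and for fixed q,
%   f(e_1) = E[F | E_1 = e_1, D = d, Q = q],   g(e_2) = E[G | E_2 = e_2, D = d, Q = q].
\begin{align*}
f(E_1) &= f(0) + \{f(1) - f(0)\}\,E_1, \qquad
g(E_2) = g(0) + \{g(1) - g(0)\}\,E_2, \\
\operatorname{Cov}\bigl(f(E_1), g(E_2) \mid D = d\bigr)
&= \{f(1) - f(0)\}\,\{g(1) - g(0)\}\,\operatorname{Cov}(E_1, E_2 \mid D = d).
\end{align*}
```

When $f$ and $g$ are nondecreasing, both bracketed differences are nonnegative, so the left-hand side is nonnegative whenever $\operatorname{Cov}(E_1, E_2 \mid D = d) \ge 0$ and nonpositive whenever $\operatorname{Cov}(E_1, E_2 \mid D = d) \le 0$.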

Note that the application of Theorem 6 requires that $\mathbb{E}[F \mid E_1, D, Q]$ is
nondecreasing in $E_1$ and that $\mathbb{E}[G \mid E_2, D, Q]$ is nondecreasing in $E_2$. Either of the
following will suffice for $\mathbb{E}[F \mid E_1, D, Q]$ to be nondecreasing in $E_1$ (similar
remarks hold for $\mathbb{E}[G \mid E_2, D, Q]$): (i) $F$ and $D$ are d-separated given $\{Q, E_1\}$
and $F$ and $E_1$ are positively monotonically associated or (ii) $F$ is a
descendant of $E_1$ and $D$, $F$ and $E_1$ do not have common causes and all directed
paths from $E_1$ to $F$ not through $D$ are of positive sign. Condition (i) suffices
by Lemma 4; condition (ii) suffices by Lemma 5.

Example 6. Although the graphs in Figure 5(iii) and in Figure 6 do
not satisfy the d-separation restrictions of Theorem 5, it can be verified that
these graphs do satisfy the d-separation restrictions of Theorem 6.
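Checks of this kind can also be mechanised. The following Python sketch tests d-separation with the standard moralisation criterion (restrict to the ancestral set, marry co-parents, drop edge directions, delete the conditioning set and test connectivity). The DAG used below is a hypothetical example with one common cause Q of F and G; it is an assumption made for illustration and is not claimed to coincide with any graph in Figure 5 or Figure 6. The function names are likewise ours.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated given zs (moralisation criterion)."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)
    adj = {v: set() for v in keep}
    for child in keep:
        pas = [p for p in parents.get(child, ()) if p in keep]
        for p in pas:                      # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p1, p2 in combinations(pas, 2):  # marry co-parents
            adj[p1].add(p2)
            adj[p2].add(p1)
    frontier = list(xs - zs)
    reached = set(frontier)
    while frontier:                        # search that never enters zs
        node = frontier.pop()
        for nb in adj[node]:
            if nb not in zs and nb not in reached:
                reached.add(nb)
                frontier.append(nb)
    return not (reached & ys)


# Hypothetical DAG: E1 -> D <- E2, A -> D, E1 -> F <- Q -> G <- E2.
dag = {"D": ["E1", "E2", "A"], "F": ["E1", "Q"], "G": ["E2", "Q"]}

theorem6_checks = {
    "F, G | E1,E2,D,Q": d_separated(dag, {"F"}, {"G"}, {"E1", "E2", "D", "Q"}),
    "F, E2 | E1,D,Q": d_separated(dag, {"F"}, {"E2"}, {"E1", "D", "Q"}),
    "G, E1 | E2,Q,D": d_separated(dag, {"G"}, {"E1"}, {"E2", "Q", "D"}),
    "Q, {E1,E2} | D": d_separated(dag, {"Q"}, {"E1", "E2"}, {"D"}),
    "Q, D": d_separated(dag, {"Q"}, {"D"}, set()),
}
theorem5_first_check = d_separated(dag, {"F"}, {"G"}, {"E1", "E2", "D"})

for label, ok in theorem6_checks.items():
    print(f"{label}: {ok}")
print(f"F, G | E1,E2,D (Theorem 5): {theorem5_first_check}")
```

In this hypothetical graph every d-separation restriction of Theorem 6 holds, whereas restriction (i) of Theorem 5 fails because of the unblocked path through the common cause Q.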
At first glance, the d-separation restrictions of Theorems 5 and 6 appear
to severely limit the settings in which conclusions about conditional
covariances can be drawn. The d-separation requirements are, in fact, somewhat
less restrictive than they may first seem. We argue that the d-separation
restrictions of either Theorem 5 or Theorem 6 will apply to most graphs in which
neither $F$ nor $G$ is a cause of the other (though the restrictions on the set of
common causes $Q$, if any, of $F$ and $G$ in Theorem 6 are more substantial).
Theorem 5 requires (i) that $F$ and $G$ are d-separated given $\{E_1, E_2, D\}$ and
(ii) that $F$ and $\{E_2, D\}$ are d-separated given $E_1$ and that $G$ and $\{E_1, D\}$
are d-separated given $E_2$. In Theorems 5 and 6 (and Figures 5 and 6), $F$
was either an ancestor or descendant of or shared a common cause with $E_1$;
and $G$ was either an ancestor or descendant of or shared a common cause
with $E_2$. The d-separation restrictions essentially just require that $F$ and $G$
are sufficiently structurally separated so that (i) $F$ and $G$ are only
associated because of $\{E_1, E_2, D\}$ and (ii) $F$ is associated with $\{E_2, D\}$ only
through $E_1$; and $G$ is associated with $\{E_1, D\}$ only through $E_2$. If neither
$F$ nor $G$ is a descendant of $D$, then the conditions will, in general, only be
violated if one of $F$ or $G$ is a cause of the other or if they share a common
cause. Theorem 6, however, allowed for $F$ and $G$ to have common causes
$Q$. The restrictions on $Q$ in Theorem 6 were somewhat substantial, but the
restrictions on $F$ and $G$ are very similar to those of Theorem 5 except that
they were made conditional on $Q$. Theorems 5 and 6 will thus apply to a
wide range of graphs in which neither $F$ nor $G$ is a cause of the other, as
can also be seen from the variety of graphs in Figures 5 and 6.

As is clear from Proposition 2, rules concerning the propagation of signs 
were sufficient to determine the sign of the covariance between two variables. 
For conditional covariances, the principles guiding such a determination are 
more subtle. The principle behind the proofs of Theorems 5 and 6 was to 
partition the conditional covariance into two components 

\[
\begin{aligned}
\operatorname{Cov}(F, G \mid D) &= \mathbb{E}[\operatorname{Cov}(F, G \mid D, Q, E_1, E_2) \mid D] \\
&\quad + \operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid D)
\end{aligned}
\]

with $Q = \emptyset$ in the proof of Theorem 5. The d-separation restrictions allowed
for the conclusion that $\operatorname{Cov}(F, G \mid D, Q, E_1, E_2) = 0$. Additional d-separation
restrictions were needed so that the second expression
$\operatorname{Cov}(\mathbb{E}[F \mid D, Q, E_1, E_2], \mathbb{E}[G \mid D, Q, E_1, E_2] \mid D)$
could be reduced to a form in which the sign of this
conditional covariance could be determined from signed edges and an appeal
to Theorem 4.
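The partition can be checked numerically. The sketch below enumerates a hypothetical toy model exactly (an assumption for illustration, not a model from the paper): D = A or (E1 and E2), F = (E1 or Q) and U_F, G = (E2 or Q) and U_G, with all exogenous bits independent. It computes the two components of Cov(F, G | D = 1), conditioning on the cells of (E1, E2, Q). Because F and G are d-separated given {E1, E2, D, Q} in this model, the first component should vanish up to floating-point error.

```python
from collections import defaultdict
from itertools import product


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1.0 - p


def build_joint():
    """Rows (prob, e1, e2, q, d, f, g) of a hypothetical toy model."""
    rows = []
    for e1, e2, a, q, uf, ug in product((0, 1), repeat=6):
        prob = (bern(0.4, e1) * bern(0.5, e2) * bern(0.3, a)
                * bern(0.6, q) * bern(0.8, uf) * bern(0.7, ug))
        d = int(a or (e1 and e2))
        f = int((e1 or q) and uf)   # noisy, positively monotone in E1 and Q
        g = int((e2 or q) and ug)   # noisy, positively monotone in E2 and Q
        rows.append((prob, e1, e2, q, d, f, g))
    return rows


def moments(rows):
    """Return (mass, E[F], E[G], Cov(F, G)) under the normalised weights."""
    w = sum(r[0] for r in rows)
    ef = sum(r[0] * r[5] for r in rows) / w
    eg = sum(r[0] * r[6] for r in rows) / w
    efg = sum(r[0] * r[5] * r[6] for r in rows) / w
    return w, ef, eg, efg - ef * eg


def decompose(rows, d_value=1):
    """Split Cov(F, G | D=d) into within-cell and between-cell components."""
    stratum = [r for r in rows if r[4] == d_value]
    w_d, ef_d, eg_d, cov_total = moments(stratum)
    cells = defaultdict(list)
    for r in stratum:
        cells[(r[1], r[2], r[3])].append(r)   # cell = (E1, E2, Q)
    within = 0.0    # E[Cov(F, G | D, Q, E1, E2) | D]
    between = 0.0   # Cov(E[F | D, Q, E1, E2], E[G | D, Q, E1, E2] | D)
    for cell in cells.values():
        w, ef, eg, cov = moments(cell)
        weight = w / w_d
        within += weight * cov
        between += weight * (ef - ef_d) * (eg - eg_d)
    return cov_total, within, between


total, within, between = decompose(build_joint(), d_value=1)
print(f"Cov(F, G | D=1)  = {total:.6f}")
print(f"within component = {within:.6f}")
print(f"between component = {between:.6f}")
```

The sign of the conditional covariance is then carried entirely by the between-cell component, which is the quantity to which the signed-edge arguments are applied.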

Having stated Theorem 6, we can now return to the motivating example 
presented in the paper's Introduction. 
Fig. 7. Causal directed acyclic graph with signed edges, under the null hypothesis of no
familial coaggregation. 

Example 7. In the motivating example described in Figure 1, with data
available only on $P_1, P_2, B_1, B_2$, we wish to test the null hypothesis of no
familial coaggregation (i.e., the null hypothesis that there are no directed edges
emanating from $F$). Note that Hudson et al. [10] consider an alternative
approach using a threshold model with additive multivariate normal latent
factors. Here we use a sufficient causation approach. Given the substantive
knowledge that for no subset of the population do the genetic causes $G_P$ and
$G_B$ of $P$ and $B$ prevent disease and that for no subset of the population do
the environmental causes $E_1$ and $E_2$ of $B$ and $P$ prevent either disease, we
have that $E_1$ and $E_2$ have positive monotonic effects on $P_1$ and $B_1$ and on
$P_2$ and $B_2$, respectively, and that $G_P$ has a positive monotonic effect on $P_1$
and on $P_2$ and that $G_B$ has a positive monotonic effect on $B_1$ and on $B_2$.
The null hypothesis of no familial coaggregation can then be represented by
the signed causal directed acyclic graph given in Figure 7.

If, in addition, using prior biological knowledge, it is assumed that there
is no synergism between $E_1$ and $G_P$ in the sufficient cause sense, then we
can apply part (vii) of Theorem 4 and, under the null hypothesis of no
familial coaggregation, we have that $\operatorname{Cov}(E_1, G_P \mid P_1 = 1) \le 0$. By Theorem 6
with $Q = \emptyset$ we have that
$\operatorname{sign}(\operatorname{Cov}(B_1, P_2 \mid P_1 = 1)) = \operatorname{sign}(\operatorname{Cov}(E_1, G_P \mid P_1 = 1))$.
Under the null hypothesis of no familial coaggregation we thus have
$\operatorname{sign}(\operatorname{Cov}(B_1, P_2 \mid P_1 = 1)) = \operatorname{sign}(\operatorname{Cov}(E_1, G_P \mid P_1 = 1)) \le 0$.
Thus, as claimed in the Introduction, a test of the null
$\operatorname{Cov}(B_1, P_2 \mid P_1 = 1) \le 0$ is a test of
no familial coaggregation under the assumption of no synergism between
$E_1$ and $G_P$. Note that by the symmetry of this example, a test of the null
$\operatorname{Cov}(B_2, P_1 \mid P_2 = 1) \le 0$ is a test of no familial coaggregation under the
assumption of no synergism between $E_2$ and $G_P$. The development of a theory
of minimal sufficient causation on directed acyclic graphs provided the
concepts necessary to derive these results.
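The sign claims of this example can be illustrated on a small hypothetical instance of the null model. In the Python sketch below, each disease indicator is the logical OR of its environmental and genetic cause (for instance P1 = E1 or GP), so that no sufficient cause involves both E1 and GP. These structural equations and all probabilities are assumptions chosen for illustration; they are not estimates from Hudson et al. [10] and are only one of many models compatible with Figure 7.

```python
from itertools import product


def bern(p, x):
    """Probability that a Bernoulli(p) variable takes the value x."""
    return p if x == 1 else 1.0 - p


def null_model_rows(p_e1=0.3, p_e2=0.3, p_gp=0.2, p_gb=0.25):
    """Enumerate a hypothetical null model with no familial factor F.

    Assumed structural equations (illustrative only):
      P1 = E1 or GP,  P2 = E2 or GP,  B1 = E1 or GB,  B2 = E2 or GB,
    with E1, E2, GP and GB mutually independent.
    """
    rows = []
    for e1, e2, gp, gb in product((0, 1), repeat=4):
        prob = bern(p_e1, e1) * bern(p_e2, e2) * bern(p_gp, gp) * bern(p_gb, gb)
        rows.append({
            "p": prob, "E1": e1, "E2": e2, "GP": gp, "GB": gb,
            "P1": int(e1 or gp), "P2": int(e2 or gp),
            "B1": int(e1 or gb), "B2": int(e2 or gb),
        })
    return rows


def cond_cov(rows, x, y, given):
    """Exact Cov(x, y | given); `given` maps variable names to values."""
    kept = [r for r in rows if all(r[k] == v for k, v in given.items())]
    total = sum(r["p"] for r in kept)
    ex = sum(r["p"] * r[x] for r in kept) / total
    ey = sum(r["p"] * r[y] for r in kept) / total
    exy = sum(r["p"] * r[x] * r[y] for r in kept) / total
    return exy - ex * ey


null_rows = null_model_rows()
cov_e1_gp = cond_cov(null_rows, "E1", "GP", {"P1": 1})
cov_b1_p2 = cond_cov(null_rows, "B1", "P2", {"P1": 1})
cov_b2_p1 = cond_cov(null_rows, "B2", "P1", {"P2": 1})
print(f"Cov(E1, GP | P1=1) = {cov_e1_gp:.4f}")
print(f"Cov(B1, P2 | P1=1) = {cov_b1_p2:.4f}")
print(f"Cov(B2, P1 | P2=1) = {cov_b2_p1:.4f}")
```

All three covariances are nonpositive in this instance, in line with the derivation above. Only the last two involve observed variables alone, which is why they are the quantities that could be tested with data on $P_1, P_2, B_1, B_2$.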

6. Discussion. In this paper we have incorporated notions of minimal 
sufficient causation into the directed acyclic graph causal framework. Doing 
so has provided a clear theoretical link between two major conceptualizations
of causality. Causal directed acyclic graphs with minimal sufficient causation 
structures have furthermore allowed for the development of rules governing 
the sign of conditional covariances and of rules governing the presence of 
conditional independencies which hold only in a particular stratum of the 
conditioning variable. 

The present work could be extended in a number of directions. Theory 
could be developed concerning cases in which a sufficient causation struc- 
ture involves redundant sufficient causes or sufficient causes that are not 
minimally sufficient. Specifically, it might be possible to develop a system 
of axiomatic rules which govern conditional independencies within strata 
of variables on a causal directed acyclic graph with a sufficient causation 
structure, to furthermore demonstrate the soundness and completeness of 
this axiomatic system and to construct algorithms for applying the rules 
to identify all conditional independencies inherent in the graph's structure. 
Another direction of further research might involve the incorporation of the 
AND and OR nodes that arise from sufficient causation structures into other 
graphical models such as summary graphs [4], MC-graphs [12], chain graph 
models [2, 6, 14, 15, 16, 23, 34, 42] and ancestral graph models [24]. Finally, 
further work could be done extending the results of Theorem 4 to yet more 
general settings than those of Theorems 5 and 6. 